Title: US-09-869-414A-4

Perfect score: 2664 
Scoring table: BLOSUM62

Gapop 10.0, Gapext 0.5


Searched: 1107863 seqs, 158726573 residues 

Total number of hits satisfying chosen parameters: 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 
Post-processing:

qp-embl/AA1980.DAT
qp-embl/AA1981.DAT:
qp-embl/AA1982.DAT:*
qp-embl/AA1983.DAT:*
qp-embl/AA1984.DAT:*
qp-embl/AA1985.DAT:*
qp-embl/AA1986.DAT:
qp-embl/AA1987.DAT:
qp-embl/AA1988.DAT:
eqp-embl/AA1989.DAT:
eqp-embl/AA1990.DAT:
eqp-embl/AA1991.DAT:*
eqp-embl/AA1992.DAT:*
eqp-embl/AA1993.DAT:*
eqp-embl/AA1994.DAT:*
eqp-embl/AA1995.DAT:*
eqp-embl/AA1996.DAT:*
eqp-embl/AA1998.DAT:*
eqp-embl/AA1999.DAT:*
eqp-embl/AA2000.DAT:*
qp-embl/AA2001.DAT:
eqp-embl/AA2002.DAT:
eqp-embl/AA2003.DAT:


Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.
ALIGNMENTS


RESULT 1 
AAY88425 

ID AAY88425 standard; Protein; 501 AA. 
XX 

AC AAY88425; 
XX 

DT 03-AUG-2000 (first entry) 
XX 

DE Human aspartyl protease 2 (a) (Asp2) amino acid sequence. 
XX 

KW Aspartyl protease; aspartase; amyloid precursor protein; APP; Asp 2; 

KW Alzheimer's disease; beta secretase site. 

XX 

OS Homo sapiens.
XX

PN WO200017369-A2.
XX

PD 30-MAR-2000.
XX

PF 23-SEP-1999; 99WO-US20881.
XX

PR 24-SEP-1998; 98US-0101594.
XX 

PA (PHAA ) PHARMACIA & UPJOHN CO. 
XX 

PI Gurney ME, Bienkowski MJ, Heinrikson RL, Parodi LA, Yan R; 
XX 

DR WPI; 2000-303209/26. 

DR N-PSDB; AAA15662. 
XX 

PT New enzyme designated human aspartase useful in research into 

PT Alzheimer's Disease is capable of cleaving amyloid protein precursor at 

PT the beta secretase site to produce amyloid beta peptide 

XX 

PS Claim 48; Fig 2; 183pp; English. 
XX 

CC This sequence represents the human aspartyl protease 2 (Asp2) amino acid 

CC sequence. The invention relates to a protease (e.g. Asp2) capable of 

CC cleaving the beta secretase site of amyloid precursor protein (APP). The

CC protease contains a sequence encoding the amino acid sequence DTG and a 

CC sequence encoding DSG or DTG separated by 100-300 amino acids. When 

CC mutated the APP gene causes an autosomal dominant form of Alzheimer's 

CC disease. APP localises to the cell surface membrane and has a single

CC C-terminal transmembrane domain. Proteolytic processing of APP produces 

CC the amyloid beta protein, which is possibly very important in Alzheimer's 

CC disease. The invention includes a nucleotide sequence encoding the 

CC protease, a vector containing the nucleotide sequence, and a cell line 

CC comprising the vector. Methods for screening for inhibitors of beta 

CC secretase activity are also given in the invention. The human aspartase 

CC protein and nucleotide sequences and the methods for identifying 

CC inhibitors of the protease, are useful in the treatment of and research 

CC into Alzheimer's disease.

XX 

SQ Sequence 501 AA; 


Query Match 100.0%; Score 2664; DB 21; Length 501;

Best Local Similarity 100.0%; Pred. No. 3.8e-263;

Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy  481 RCLRCLRQQHDDFADDISLLK 501
        |||||||||||||||||||||
Db  481 RCLRCLRQQHDDFADDISLLK 501
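
The CC text of this record describes the protease as containing a DTG motif and a second DSG or DTG motif separated by 100-300 amino acids. A minimal Python sketch of that check follows. It assumes the separation is counted as the number of residues strictly between the two motifs, and it uses only residues 61-300 retyped from the Db lines of this alignment. The helper name `find_motif_pairs` is hypothetical and not part of the search report.

```python
import re

# Residues 61-300 of the 501-residue sequence, retyped from the Db lines above.
SEGMENT_START = 61
SEGMENT = (
    "VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST"
    "YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL"
    "GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI"
    "DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK"
)


def find_motif_pairs(seq, offset=1, min_gap=100, max_gap=300):
    """Return (first_start, second_start, gap) for DTG ... D[ST]G pairs.

    Starts are 1-based sequence positions. The gap is the number of
    residues strictly between the two motifs (an assumed convention).
    """
    firsts = [m.start() for m in re.finditer("DTG", seq)]
    seconds = [m.start() for m in re.finditer("D[ST]G", seq)]
    pairs = []
    for i in firsts:
        for j in seconds:
            gap = j - (i + 3)  # residues between end of motif 1 and start of motif 2
            if min_gap <= gap <= max_gap:
                pairs.append((i + offset, j + offset, gap))
    return pairs


pairs = find_motif_pairs(SEGMENT, offset=SEGMENT_START)
print(pairs)
```

Under these assumptions the motif starts fall at positions 93 and 289, which agrees with the Active-site features 93..95 and 289..291 listed in the feature table of AAE02581 (RESULT 6).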


RESULT 2 
AAE10629 

ID AAE10629 standard; Protein; 501 AA. 
XX 

AC AAE10629;
XX 

DT 10-DEC-2001 (first entry) 
XX 

DE Human aspartyl protease 2(a) [hu-Asp2(a)] protein. 
XX 

KW Human; aspartyl protease 2(a); Asp2(a); amyloid precursor protein; APP;

KW Alzheimer's disease; AD; dementia; neurofibrillary tangle; gliosis;

KW amyloid plaque; neuronal loss; proteolytic; nootropic; neuroprotective;

KW chromosome 11q23.3-24.1.
XX

OS Homo sapiens.
XX 

FH Key Location/Qualifiers 


FT Peptide 1..21
FT /label= Signal_peptide
FT Peptide 22..45
FT /label= Asp_2a_prepropeptide
FT Peptide 46..57
FT /label= Asp_2a_propeptide
FT Protein 58..501
FT /label= Mature_human_Asp_2a_protein
FT Region 420..454
FT /label= Alpha-helical_spacer_region
FT Domain 455..477
FT /label= Transmembrane_domain
FT Domain 478..501
FT /label= Cytoplasmic_domain

XX 


PN GB2357767-A. 
XX 


PD 04-JUL-2001.
XX

PF 22-SEP-2000; 2000GB-0023315.
XX

PR 23-SEP-1999; 99US-0155493.
PR 23-SEP-1999; 99US-0404133.
PR 23-SEP-1999; 99WO-US20881.
PR 13-OCT-1999; 99US-0416901.
PR 06-DEC-1999; 99US-0169232.


XX 

PA (PHAA ) PHARMACIA & UPJOHN CO. 
XX 

PI Bienkowkski MJ, Gurney M; 
XX 

DR WPI; 2001-444208/48. 

DR N-PSDB; AAD17865. 
XX 

PT Polypeptide comprising fragments of human aspartyl protease with 

PT amyloid precursor protein processing activity and alpha-secretase 

PT activity, for identifying modulators useful in treating Alzheimer's 

PT disease - 
XX 

PS Example 2; Fig 2; 187pp; English. 
XX 

CC The patent discloses human aspartyl protease 1 (hu-Asp1) or modified
CC Asp1 proteins which lack transmembrane domain or amino terminal
CC domain or cytoplasmic domain and retains alpha-secretase activity
CC and amyloid protein precursor (APP) processing activity. The proteins
CC of the invention are useful for assaying hu-Asp1 alpha-secretase
CC activity, which in turn is useful for identifying modulators of
CC hu-Asp1 alpha-secretase activity, where modulators that increase
CC hu-Asp1 alpha-secretase activity are useful for treating Alzheimer's
CC disease (AD) which causes progressive dementia with consequent
CC formation of amyloid plaques, neurofibrillary tangles, gliosis and
CC neuronal loss. Hu-Asp1 protease substrate is useful for assaying
CC hu-Asp1 proteolytic activity, by contacting hu-Asp1 protein with
CC the substrate under acidic conditions and determining the level of
CC hu-Asp1 proteolytic activity. The present sequence is long form of
CC human Asp2 protein, designated as Asp2(a). Asp2 gene is localised
CC on chromosome 11q23.3-24.1.


XX 

SQ Sequence 501 AA; 


Query Match 100.0%; Score 2664; DB 22; Length 501; 

Best Local Similarity 100.0%; Pred. No. 3.8e-263; 

Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy  481 RCLRCLRQQHDDFADDISLLK 501
        |||||||||||||||||||||
Db  481 RCLRCLRQQHDDFADDISLLK 501
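
The feature table (FT lines) of this record gives residue ranges for the signal peptide, propeptides and mature protein. The sketch below shows one way such lines might be parsed and used to slice a sequence. It assumes the locations are written in the plain `start..end` form (the lines are retyped here with OCR spacing removed), and the names `parse_features`, `LOCATION` and `QUALIFIER` are hypothetical helpers, not part of any database tool.

```python
import re

# A few FT lines from the record above, retyped with "start..end" locations.
FT_LINES = """\
FT Peptide 1..21
FT /label= Signal_peptide
FT Peptide 22..45
FT /label= Asp_2a_prepropeptide
FT Protein 58..501
FT /label= Mature_human_Asp_2a_protein
""".splitlines()

# First 60 residues, retyped from the alignment above.
N_TERMINUS = "MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF"

LOCATION = re.compile(r"^FT\s+(\S+)\s+(\d+)\.\.(\d+)\s*$")
QUALIFIER = re.compile(r"^FT\s+/(label|note)=\s*(.+?)\s*$")


def parse_features(lines):
    """Collect feature keys, 1-based inclusive ranges and their labels."""
    features = []
    for line in lines:
        loc = LOCATION.match(line)
        if loc:
            features.append({
                "key": loc.group(1),
                "start": int(loc.group(2)),
                "end": int(loc.group(3)),
                "label": None,
            })
            continue
        qual = QUALIFIER.match(line)
        if qual and features:
            # Attach the qualifier to the most recent feature.
            features[-1]["label"] = qual.group(2).strip('"')
    return features


features = parse_features(FT_LINES)
signal = features[0]
# Convert the 1-based inclusive range to a Python slice.
signal_seq = N_TERMINUS[signal["start"] - 1:signal["end"]]
print(signal["label"], signal_seq)
```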


RESULT 3 
AAE06859 

ID AAE06859 standard; Protein; 501 AA. 
XX 

AC AAE06859; 
XX 

DT 23-OCT-2001 (first entry) 
XX 

DE Human aspartyl protease 2a (Hu-Asp2a) protein. 
XX 

KW Human; aspartyl protease 2a; Asp 2a; beta-amyloid precursor protein; APP; 

KW beta-secretase; Alzheimer's disease; dementia; amyloid plaque; gliosis; 

KW neurofibrillary tangle; neuronal loss; amyloid-beta peptide; nootropic; 


KW neuroprotective; antisense therapy; gene therapy; 

KW chromosome 11q23.3-24.1.

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT Peptide 1..21
FT /label= Signal_peptide
FT Protein 22..501
FT /note= "Mature human aspartyl protease 2a (Hu-Asp2a)"
FT Region 420..454
FT /note= "Alpha helical spacer region"
FT Domain 455..477
FT /label= Transmembrane_domain
FT Domain 478..501
FT /label= Cytoplasmic_domain
XX 

PN WO200150829-A2. 
XX 

PD 19-JUL-2001. 
XX 

PF 09-MAY-2001; 2001WO-IB00799.
XX

PR 09-MAY-2001; 2001WO-IB00799.
XX 

PA (BIEN/) BIENKOWSKI M J. 

PA (GURN/) GURNEY M E. 

PA (HEIN/) HEINRIKSON R L. 

PA (PARO/) PARODI L A. 

PA (YANR/) YAN R. 

XX 

PI Bienkowski MJ, Gurney ME, Heinrikson RL, Parodi LA, Yan R; 
XX 

DR WPI; 2001-483072/52. 

DR N-PSDB; AAD13021. 
XX 

PT Novel purified polypeptide comprising fragment of mammalian aspartyl 

PT protease 2, lacking Asp2 transmembrane domain and retaining beta 

PT secretase activity of Asp2 useful for identifying inhibitors of Asp2 

PT activity - 

XX 

PS Claim 49; Fig 2; 185pp; English. 
XX 

CC The invention relates to human aspartyl proteases (Hu-Asp), beta-amyloid
CC precursor protein (APP) isoforms and their corresponding DNA molecules.
CC Human aspartyl proteases can act as beta-secretase proteases useful for
CC treating Alzheimer's disease. APP isoforms are useful for identifying
CC modulators of amyloid-beta peptide production, for use in designing
CC therapeutics for the treatment and prevention of Alzheimer's disease,
CC dementia, formation of amyloid plaques, neurofibrillary tangles, gliosis
CC and neuronal loss. APP isoforms are also used in methods for identifying
CC inhibitors and modulators of human Asp2 activity. The invention relates
CC to a method for identifying agents that modulate the activity of human
CC aspartyl protease Asp2. Amyloid-beta peptides obtained from APP are used
CC as a means to screen in cellular assays for the inhibitors of beta- and
CC gamma-secretase. Hu-Asp DNA fragments are useful as probes or primers in
CC polymerase chain reactions (PCR). The probes are useful for detecting
CC Hu-Asp nucleic acids in in vitro assays and in Northern and Southern
CC blots. The present sequence is human aspartyl protease 2 (Hu-Asp2), a
CC 'long' form designated as (Hu-Asp2a). Hu-Asp 2 gene is localised on
CC chromosome 11q23.3-24.1.
XX

SQ Sequence 501 AA; 

Query Match 100.0%; Score 2664; DB 22; Length 501; 

Best Local Similarity 100.0%; Pred. No. 3.8e-263; 

Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy  481 RCLRCLRQQHDDFADDISLLK 501
        |||||||||||||||||||||
Db  481 RCLRCLRQQHDDFADDISLLK 501
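
Each result carries a three-line summary (Query Match, Score, Pred. No., Matches and so on). The sketch below parses such a summary into a dictionary and recomputes two percentages. It rests on two assumptions that the report itself does not state: that Query Match is the hit score divided by the perfect score from the header, and that an identity percentage can be taken as Matches divided by Length. The function name `parse_stats` is hypothetical.

```python
import re

PERFECT_SCORE = 2664  # "Perfect score" from the report header

# Summary lines as printed for this result.
STATS = """\
Query Match 100.0%; Score 2664; DB 22; Length 501;
Best Local Similarity 100.0%; Pred. No. 3.8e-263;
Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0;
"""

# A field is a name followed by one number, optionally with a percent sign.
FIELD = re.compile(
    r"\s*([A-Za-z][A-Za-z. ]*?)\s+([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)%?\s*$"
)


def parse_stats(text):
    """Split 'name value;' fields into a dict of ints and floats."""
    stats = {}
    for chunk in text.replace("\n", " ").split(";"):
        m = FIELD.match(chunk)
        if m:
            value = m.group(2)
            is_float = any(c in value for c in ".eE")
            stats[m.group(1)] = float(value) if is_float else int(value)
    return stats


stats = parse_stats(STATS)
query_match = 100.0 * stats["Score"] / PERFECT_SCORE      # assumed definition
identity = 100.0 * stats["Matches"] / stats["Length"]     # assumed definition
print(stats)
print(query_match, identity)
```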


RESULT 4 
AAU06603 

ID AAU06603 standard; Protein; 501 AA. 
XX 

AC AAU06603; 
XX 

DT 24-OCT-2001 (first entry) 
XX 

DE Human Aspartyl protease 2(a), Asp2(a).


XX 

KW Human; Aspartyl protease; Asp2(a); beta-secretase; nootropic;

KW neuroprotective; amyloid protein precursor; APP; Alzheimer's disease; 

KW amyloid-beta; Abeta. 

XX 

OS Homo sapiens. 


XX 

FH Key Location/Qualifiers
FT Peptide 1..21
FT /label= Signal_peptide
FT Peptide 22..45
FT /label= Pre_pro_peptide
FT Peptide 46..57
FT /label= Pro_peptide
FT Protein 57..501
FT /label= Mature_Asp2(a)
FT Region 420..454
FT /label= Alpha_helical_spacer_region
FT Domain 455..477
FT /label= Transmembrane_domain
FT Domain 478..501
FT /label= Cytoplasmic_domain

XX 


PN WO200149098-A2. 
XX 

PD 12-JUL-2001. 
XX 

PF 09-MAY-2001; 2001WO-IB00798.
XX

PR 09-MAY-2001; 2001WO-IB00798.
XX 

PA (BIEN/) BIENKOWSKI M J. 

PA (GURN/) GURNEY M E.

PA (HEIN/) HEINRIKSON R L. 

PA (PARO/) PARODI L A. 

PA (YANR/) YAN R. 

XX 

PI Bienkowski MJ, Gurney ME, Heinrikson RL, Parodi LA, Yan R; 
XX 

DR WPI; 2001-502549/55. 

DR N-PSDB; AAS11517. 
XX 

PT Novel purified polypeptide comprising fragment of mammalian aspartyl 

PT protease 2, lacking Asp2 transmembrane domain and retaining beta 

PT secretase activity of Asp2 useful for identifying inhibitors of Asp2 

PT activity 

XX 

PS Claim 49; Fig 2; 185pp; English. 
XX 

CC The invention relates to a purified polypeptide comprising a fragment of 

CC mammalian aspartyl protease (Asp) 2 protein which lacks the Asp2 

CC transmembrane domain and the Asp2 protein, and where the polypeptide and 

CC the fragment retain the beta-secretase activity of the mammalian Asp2 

CC protein. The invention also details polynucleotides for the Asp 

CC proteins and vectors expressing them, and a polypeptide (isoform of 

CC amyloid protein precursor (APP)) comprising the amino acid sequence of an

CC APP or its fragment containing an APP cleavage site recognizable by a 


CC mammalian beta-secretase, and further comprising two lysine residues at 

CC the carboxyl terminus of the amino acid sequence of the mammalian APP or 

CC APP fragment. Also included in the invention are methods of identifying 

CC modulators or inhibitors of Asp2. Modulators and inhibitors of Asp2 are

CC useful for treating Alzheimer's disease. APP is useful in methods for

CC identifying inhibitors or modulators of human Asp2 activity and 

CC amyloid-beta (Abeta) peptide production. APP is also useful in designing 

CC therapeutics for the treatment or prevention of Alzheimer's disease.

CC APP comprising the APP-Sw-beta-secretase peptide sequence (NLDA), which

CC is associated with increased levels of Abeta processing is useful in 

CC assays relating to Alzheimer's research. The expression vector is useful

CC for recombinantly expressing APP. Nucleic acids that hybridise to 

CC Asp oligonucleotides are useful as probes or primers. The probes are 

CC useful for detecting Hu-Asp nucleic acids in in vitro assays and in 

CC Northern and Southern blots. The present sequence is human Asp2(a).

XX 

SQ Sequence 501 AA; 


Query Match 100.0%; Score 2664; DB 22; Length 501; 

Best Local Similarity 100.0%; Pred. No. 3.8e-263; 

Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 


Qy    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy  481 RCLRCLRQQHDDFADDISLLK 501
        |||||||||||||||||||||
Db  481 RCLRCLRQQHDDFADDISLLK 501
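
The line between each Qy and Db row marks the columns where the two residues agree, and the Matches, Mismatches and Indels totals summarise those columns. A minimal sketch of that bookkeeping follows. It only marks identities, since scoring conservative substitutions would need the BLOSUM62 matrix, which is not reproduced here. The function name `compare_alignment`, the use of `-` as the gap character, and counting each gapped column as one indel are all assumptions.

```python
def compare_alignment(qy, db):
    """Return (match_line, counts) for two aligned strings of equal length."""
    if len(qy) != len(db):
        raise ValueError("aligned strings must have equal length")
    counts = {"matches": 0, "mismatches": 0, "indels": 0}
    marks = []
    for a, b in zip(qy, db):
        if a == "-" or b == "-":
            counts["indels"] += 1      # gap column (assumed '-' convention)
            marks.append(" ")
        elif a == b:
            counts["matches"] += 1     # identical residues
            marks.append("|")
        else:
            counts["mismatches"] += 1  # substitution, conservative or not
            marks.append(" ")
    return "".join(marks), counts


# Final block of the alignment above (residues 481-501).
QY_481_501 = "RCLRCLRQQHDDFADDISLLK"
DB_481_501 = "RCLRCLRQQHDDFADDISLLK"
marks, counts = compare_alignment(QY_481_501, DB_481_501)
print(marks)
print(counts)

# Hypothetical pair with one substitution and one gap, for contrast.
demo_marks, demo_counts = compare_alignment("RCLRC-RQQ", "RCMRCLRQQ")
print(demo_marks)
print(demo_counts)
```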


RESULT 5 
AAU07202 

ID AAU07202 standard; Protein; 501 AA. 
XX 

AC AAU07202; 
XX 

DT 24-OCT-2001 (first entry) 
XX 

DE Human aspartyl protease 2a (Asp-2a).
XX 

KW Human; aspartyl protease 1; Asp-1; nootropic; neuroprotective; 

KW aspartyl protease 2; Asp2; amyloid protein precursor; APP; 

KW beta-secretase; Alzheimer's disease. 
XX 

OS Homo sapiens. 


XX 

FH Key Location/Qualifiers 

FT Peptide 1..21
FT /note= "Signal peptide"
FT Misc_feature 22..45
FT /note= "Pre-propeptide"
FT Misc_feature 46..57
FT /note= "Propeptide"
FT Protein 58..501
FT /note= "Mature Aspartyl protease-2a"
FT Region 420..454
FT /note= "Alpha helical spacer region"
FT Domain 455..477
FT /note= "Transmembrane domain"
FT Domain 478..501
FT /note= "Cytoplasmic domain"

XX 


PN WO200149097-A2.
XX

PD 12-JUL-2001.
XX

PF 09-MAY-2001; 2001WO-IB00797.
XX

PR 09-MAY-2001; 2001WO-IB00797.
XX 

PA (BIEN/) BIENKOWSKI M J. 

PA (GURN/) GURNEY M E. 

PA (HEIN/) HEINRIKSON R L. 

PA (PARO/) PARODI L A. 

PA (YANR/) YAN R. 

XX 

PI Bienkowski MJ, Gurney ME, Heinrikson RL, Parodi LA, Yan R; 
XX 

DR WPI; 2001-502548/55. 

DR N-PSDB; AAS11702. 
XX 

PT Novel purified polypeptide comprising fragment of mammalian aspartyl 

PT protease 2, lacking Asp2 transmembrane domain and retaining beta 

PT secretase activity of Asp2 useful for identifying inhibitors of Asp2 

PT activity - 


PS Claim 49; Fig 2; 185pp; English. 
XX 

CC The invention relates to a novel purified polypeptide comprising a 

CC fragment of mammalian aspartyl protease 2 (Asp2) protein which lacks the 

CC Asp2 transmembrane domain and the Asp2 protein, and where the polypeptide 

CC and the fragment retain the beta-secretase activity of the mammalian Asp2 

CC protein. Also included is an isoform of amyloid protein precursor (APP) 

CC comprising the amino acid sequence of a APP or its fragment containing 

CC an APP cleavage site recognisable by a mammalian beta-secretase, and 

CC further comprising two lysine residues at the carboxyl terminus of the 

CC amino acid sequence of the mammalian APP or APP fragment. The 

CC polypeptides are used for assaying for modulators of beta-secretase 

CC activity; identifying agents that inhibit the APP processing activity 

CC of human Asp2 aspartyl protease (Hu-Asp2); identifying agents that

CC modulate the activity of Asp2; and for reducing cellular production of 

CC amyloid beta (Abeta) from APP. Agents identified by the above methods 

CC are useful for treating Alzheimer's disease; and for identifying 

CC modulators of amyloid-beta (Abeta) peptide production, for use in 

CC designing therapeutics for the treatment or prevention of Alzheimer's 

CC disease. Probes and primers derived from Asp nucleic acid sequences 

CC are useful for detecting Hu-Asp nucleic acids in in vitro assays and in 

CC Northern and Southern blots. The present sequence represents the 

CC amino acid sequence of human Asp-2a used in the methods of the invention. 

XX 

SQ Sequence 501 AA; 

Query Match 100.0%; Score 2664; DB 22; Length 501; 

Best Local Similarity 100.0%; Pred. No. 3.8e-263; 

Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0;


Qy    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy  481 RCLRCLRQQHDDFADDISLLK 501
        |||||||||||||||||||||
Db  481 RCLRCLRQQHDDFADDISLLK 501


RESULT 6 
AAE02581 

ID AAE02581 standard; Protein; 501 AA. 
XX 

AC AAE02581; 
XX 

DT 10-AUG-2001 (first entry)
XX

DE Human aspartyl protease 2a (Asp 2a).
XX

KW Human; alpha-secretase; amyloid precursor protein; APP; therapy;

KW Alzheimer's disease; antialzheimer's; aspartyl protease 2a; Asp 2a;

KW beta-secretase; chromosome 11q23.3-24.1.
XX

OS Homo sapiens.


XX 

FH Key Location/Qualifiers 

FT Peptide 1..21
FT /label= Signal_peptide
FT Peptide 22..45
FT /label= Asp_2a_prepropeptide
FT Peptide 46..57
FT /label= Asp_2a_propeptide
FT Protein 58..501
FT /label= Mature_human_Asp_2a_protein
FT Active-site 93..95
FT /label= Active_site_1
FT Active-site 289..291
FT /label= Active_site_2
FT Region 420..454
FT /label= Alpha_helical_spacer
FT Domain 455..477
FT /label= Transmembrane_domain
FT Domain 478..501
FT /label= Cytoplasmic_domain
FT Region 486..501
FT /note= "Peptide #2"

XX 


PN WO200123533-A2. 
XX 

PD 05-APR-2001. 
XX 

PF 22-SEP-2000; 2000WO-US26080.
XX

PR 23-SEP-1999; 99US-0155493.

PR 23-SEP-1999; 99WO-US20881.

PR 13-OCT-1999; 99US-0416901.

PR 06-DEC-1999; 99US-0169232.
XX 

PA (PHAA ) PHARMACIA & UPJOHN CO. 
XX 

PI Gurney M, Bienkowski MJ; 
XX 

DR WPI; 2001-290516/30. 

DR N-PSDB; AAD06739. 
XX 

PT Enzymes that cleave the alpha-secretase site of the amyloid precursor 

PT protein, useful for the treatment of Alzheimer's disease - 

XX 

PS Example 2; Fig 2; 189pp; English. 
XX 

CC The present invention relates to enzymes for cleaving the alpha- 

CC secretase site of the amyloid precursor protein (APP) and methods of 

CC identifying those enzymes. The methods may be used to identify enzymes 

CC that may be used to cleave the alpha-secretase cleavage site of the APP 

CC protein. The enzymes may be used to treat or modulate the progress of 

CC Alzheimer's disease. The present sequence is human aspartyl protease 2a 

CC (Asp 2a). Asp 2a has beta-secretase protease activity. Asp2 gene

CC is located on chromosome 11q23.3-24.1.

XX 

SQ Sequence 501 AA; 

Query Match 100.0%; Score 2664; DB 22; Length 501; 

Best Local Similarity 100.0%; Pred. No. 3.8e-263; 

Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy  481 RCLRCLRQQHDDFADDISLLK 501
        |||||||||||||||||||||
Db  481 RCLRCLRQQHDDFADDISLLK 501


RESULT 7
ABB78590
ID ABB78590 standard; Protein; 501 AA.
XX
AC ABB78590;
XX
DT 16-JUL-2002 (first entry)
XX
DE Human Asp-2(a) protein sequence SEQ ID NO: 4.
XX
KW Human; Asp-1; Asp-2; aspartyl protease; enzyme; Alzheimer's disease;
KW proteolytic; chromosome 11q23.3-24.1.
XX
OS Homo sapiens.
XX
PN GB2367060-A.
XX
PD 27-MAR-2002.
XX
PF 29-OCT-2001; 2001GB-0025934.
XX
PR 23-SEP-1999; 99US-155493P.
PR 23-SEP-1999; 99US-0404133.
PR 23-SEP-1999; 99WO-US20881.
PR 13-OCT-1999; 99US-0416901.
PR 06-DEC-1999; 99US-169232P.
PR 22-SEP-2000; 2000GB-0023315.
XX
PA (PHAA ) PHARMACIA & UPJOHN CO.
XX
PI Bienkowkski MJ, Gurney M;
XX
DR WPI; 2002-396337/43.
DR N-PSDB; ABL52457.
XX
PT Human aspartyl protease 1 substrates useful in assays to detect
PT aspartyl protease activity, e.g. for the diagnosis of Alzheimer's
PT disease -
XX
PS Example 2; Fig 2; 182pp; English.
XX
CC The present invention describes a human aspartyl protease 1 (hu-Asp1)
CC substrate (I) which comprises a peptide of no more than 50 amino acids,
CC and which comprises the 8 amino acid sequence Gly-Leu-Ala-Leu-Ala-Leu-
CC Glu-Pro. Also described are: (1) a method (II) for assaying hu-Asp1
CC proteolytic activity, comprising: (a) contacting a hu-Asp1 protein with
CC (I) under acidic conditions; and (b) determining the level of hu-Asp1
CC proteolytic activity; (2) a purified polynucleotide (III) comprising a
CC nucleotide sequence that hybridises under stringent conditions to the
CC non-coding strand complementary to a defined 1804 nucleotide sequence
CC (see ABL52456) where the nucleotide sequence encodes a polypeptide having
CC Asp1 proteolytic activity and lacks nucleotides encoding a transmembrane
CC domain); (3) a purified polynucleotide (III') comprising a sequence that
CC hybridises under stringent conditions to (III) (the nucleotide sequence
CC encodes a polypeptide further lacking a pro-peptide domain corresponding
CC to amino acids 23-62 of hu-Asp1 (see ABB78589)); (4) a vector (IV)
CC comprising (III) or (III'); and (5) a host cell (V) transformed or
CC transfected with (III), (III') and/or (IV). The hu-Asp1 protease
CC substrate (I) may be used as an enzyme substrate in assays to detect
CC aspartyl protease activity, (II) and therefore diagnose diseases
CC associated with aberrant hu-Asp1 expression and activity such as
CC Alzheimer's disease. Hu-Asp1 has been localised to chromosome 21, while
CC hu-Asp2 has been localised to chromosome 11q23.3-24.1. The present
CC sequence represents hu-Asp2(a) from the present invention.

XX 

SQ Sequence 501 AA; 

Query Match 100.0%; Score 2664; DB 23; Length 501; 

Best Local Similarity 100.0%; Pred. No. 3.8e-263; 

Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 


Qy   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       |||||||||||||||||||||
Db 481 RCLRCLRQQHDDFADDISLLK 501


RESULT 8 
ABB06409 

ID ABB06409 standard; Protein; 501 AA. 
XX 

AC ABB06409; 
XX 

DT 31-MAY-2002 (first entry) 
XX 

DE Human aspartyl protease protein sequence SEQ ID NO: 2. 
XX 

KW Beta-secretase; enzyme; cleavage site; amyloid protein precursor; APP; 

KW aspartyl protease; neuroprotective; nootropic; beta-secretase inhibitor; 

KW Alzheimer's disease. 
XX 

OS Homo sapiens.
XX

PN WO200206306-A2.
XX

PD 24-JAN-2002.
XX

PF 19-JUL-2001; 2001WO-US23035.
XX

PR 19-JUL-2000; 2000US-219795P.

PR 12-MAR-2001; 2001US-275251P.
XX 

PA (PHAA ) PHARMACIA & UPJOHN CO. 
XX 

PI Yan R, Tomasselli AG, Gurney ME, Emmons TL, Bienkowski MJ; 

PI Heinrikson RL; 

XX 

DR WPI; 2002-216995/27. 

DR N-PSDB; ABL49914. 
XX 

PT Novel substrates for human aspartyl protease useful for identifying 

PT modulators of beta secretase activity of aspartyl protease for treating 

PT Alzheimer's disease 

XX 

PS Claim 63; Page 118-119; 188pp; English. 
XX 

CC The present invention describes an isolated peptide (I) comprising a 

CC sequence of at least four amino acids, where the peptide is a substrate 

CC for conducting aspartyl protease assays. (I) has neuroprotective and 

CC nootropic activities, and can be used as an inhibitor of beta-secretase 

CC activity. A beta-secretase modulator from the present invention can be 

CC used for inhibiting beta-secretase activity in vivo, and in the 

CC manufacture of a medicament for the treatment of Alzheimer's disease. 

CC Pharmaceutical compositions from the present invention can be used for 

CC treating a disease or condition characterised by an abnormal beta- 

CC secretase activity. (I) is useful for identifying agents that modulate 

CC the activity of human Asp2 aspartyl protease (Hu-Asp2). (I) is useful

CC as a core structure to construct derivatives. ABL49914 to ABL49925 and 

CC ABB06409 to ABB06593 represent sequences used in the exemplification 

CC of the present invention. 


XX 

SQ Sequence 501 AA; 


Query Match 100.0%; Score 2664; DB 23; Length 501; 

Best Local Similarity 100.0%; Pred. No. 3.8e-263; 

Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       |||||||||||||||||||||
Db 481 RCLRCLRQQHDDFADDISLLK 501


RESULT 9
AAY94767
ID AAY94767 standard; Protein; 501 AA.
XX
AC AAY94767;
XX
DT 12-FEB-2001 (first entry)
XX
DE Human beta-secretase amino acid sequence.
XX
KW Beta-secretase; enzyme; amyloid plaque; Alzheimer's disease; human;
KW Down's syndrome; amyloid angiopathy; gene therapy; neuroprotective.
XX
OS Homo sapiens.
XX
FH Key             Location/Qualifiers
FT Peptide         1..45
FT                 /label= putative signal peptide
FT Protein         46..501
FT                 /label= Beta-secretase

XX 

PN WO200058479-A1. 
XX 

PD 05-OCT-2000. 
XX 

PF 23-MAR-2000; 2000WO-US07755.
XX

PR 26-MAR-1999; 99US-0277229.
XX 

PA (AMGE-) AMGEN INC. 
XX 

PI Citron M, Vassar RJ, Bennett BD; 
XX 

DR WPI; 2000-594643/56. 

DR N-PSDB; AAA28278. 
XX 

PT Isolated beta-secretase nucleic acids and encoded polypeptides, useful 

PT for diagnosis and gene therapy of Alzheimer's disease - 

XX 

PS Claim 1; Fig 4; 145pp; English. 
XX 

CC This invention relates to 3 nucleotide sequences encoding beta-secretase 

CC proteins. Beta-secretase is an enzyme involved in the production of one 

CC of the components of amyloid plaques involved in Alzheimer's disease. The 

CC invention includes an expression vector comprising the nucleotide 

CC sequence, a host cell comprising the expression vector, and a process for 

CC producing the protein through culturing the transformed cells. Also

CC included in the invention are a polypeptide derivative of the 

CC beta-secretase protein, a fusion protein comprising beta-secretase fused 

CC to a heterologous amino acid sequence, and a method for modulating the 

CC levels of beta-secretase polypeptide in a mammal comprising administering 

CC the polynucleotide sequence. Beta-secretase exhibits neuroprotective and 

CC nootropic activity. The beta-secretase nucleotide sequence may be used to 

CC map locations of the beta-secretase gene and related genes on chromosomes 

CC and as hybridization probes in diagnostic assays to test for the presence 

CC of beta-secretase DNA or RNA, such as in Alzheimer's disease, Down's

CC syndrome, and amyloid angiopathy. The nucleotide sequence may also be 

CC used as anti-sense inhibitors of beta-secretase expression, in gene 

CC therapy of Alzheimer's disease, and for the identification of compounds 

CC that modulate beta-secretase activity. Antibodies to the beta-secretase 

CC protein may be used for in vitro and in vivo diagnostic purposes to 

CC detect the presence of beta-secretase polypeptide in a body fluid or cell 

CC sample. The present sequence represents the human beta-secretase protein. 

XX 

SQ Sequence 501 AA; 


Query Match 99.7%; Score 2656; DB 21; Length 501;
Best Local Similarity 99.8%; Pred. No. 2.5e-262;
Matches 500; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       |||||||||||||||||||||
Db 481 RCLRCLRQQHDDFADDISLLK 501


RESULT 10 
AAB07896 

ID AAB07896 standard; Protein; 501 AA. 
XX 

AC AAB07896; 
XX 

DT 14-NOV-2000 (first entry) 
XX 

DE Amino acid sequence of a human beta-secretase enzyme. 
XX 

KW Beta-secretase; beta-amyloid precursor protein; beta-amyloid peptide; 
KW amyloid plaque component; Alzheimer's disease; amyloidogenic disease; 
KW inhibitor.
XX

OS Homo sapiens.
XX

PN WO200047618-A2.
XX

PD 17-AUG-2000.
XX

PF 10-FEB-2000; 2000WO-US03819.
XX

PR 10-FEB-1999; 99US-0119571.

PR 15-JUN-1999; 99US-0139172.
XX

PA (ELAN-) ELAN PHARM INC.
XX 

PI Anderson JP, Basi G, Doane MT, Frigon N, John V, Power M; 

PI Sinha S, Tatsuno G, Tung J, Wang S, McConlogue L; 

XX 

DR WPI; 2000-533011/48. 

DR N-PSDB; AAA59550, AAA59551. 

XX 

PT Purified beta-secretase protein used in assays to discover inhibitors 

PT which can be used for the treatment of amyloidogenic diseases e.g. 

PT Alzheimer's disease - 
XX 

PS Claim 17; Fig 2A; 121pp; English. 
XX 

CC The specification describes a beta-secretase enzyme. The enzyme cleaves 

CC beta-amyloid precursor protein to produce beta-amyloid peptide. This 

CC enzyme is therefore implicated in the production of amyloid plaque 

CC components which accumulate in the brains of individuals afflicted with 

CC Alzheimer's disease. Inhibitors of beta-secretase are administered to 

CC a mammalian subject e.g. with Alzheimer's disease or Alzheimer's 

CC disease-like pathology to test if they maintain or improve cognitive 

CC ability or reduce the plaque burden. The compounds are used for the 

CC treatment of amyloidogenic diseases e.g. Alzheimer's disease. The 

CC present sequence represents a human beta-secretase enzyme. 

XX 

SQ Sequence 501 AA; 

Query Match 99.7%; Score 2656; DB 21; Length 501; 
Best Local Similarity 99.8%; Pred. No. 2.5e-262; 

Matches 500; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 


Qy   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       |||||||||||||||||||||
Db 481 RCLRCLRQQHDDFADDISLLK 501
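
The summary figures printed with each result can be cross-checked against the alignment itself. The sketch below is illustrative only and is not part of the search output. It assumes that "Query Match" is the hit score divided by the perfect score (2664 in this report) and that "Best Local Similarity" is the number of matches divided by the number of aligned columns; both assumptions reproduce the values reported above. The helper names `percent` and `mismatch_positions` are hypothetical.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def mismatch_positions(query_row, db_row, start):
    """List (position, query residue, db residue) for every differing column.

    `start` is the 1-based coordinate of the first residue in the row.
    Assumes an ungapped row, as in the alignments of this report.
    """
    return [
        (start + i, q, d)
        for i, (q, d) in enumerate(zip(query_row, db_row))
        if q != d
    ]


PERFECT_SCORE = 2664  # "Perfect score" from the report header

# Figures reported for the 500-match hits (score 2656, 501 aligned columns).
print(percent(2656, PERFECT_SCORE))  # 99.7
print(percent(500, 501))             # 99.8

# Row 181-240 of a hit with one mismatch.
qy = "GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI"
db = "GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI"
print(mismatch_positions(qy, db, 181))  # [(214, 'H', 'Q')]
```

Under these assumptions, the single mismatch in the 500-match hits falls at residue 214, where the query has H and the database entry has Q.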


RESULT 11
ABG78374
ID ABG78374 standard; Protein; 501 AA.
XX
AC ABG78374;
XX
DT 15-NOV-2002 (first entry)
XX
DE Human prepromemapsin 2.
XX
KW Human; enzyme; memapsin 2; aspartic protease; beta secretase;
KW degenerative disease; Alzheimer's disease; amyloid precursor protein;
KW APP; neuroprotective; nootropic; inhibitors;
KW substrate side-chain preference.
XX
OS Homo sapiens.
XX
PN WO200253594-A2.
XX
PD 11-JUL-2002.
XX
PF 28-DEC-2001; 2001WO-US50826.
XX
PR 28-DEC-2000; 2000US-258705P.
PR 14-MAR-2001; 2001US-275756P.
XX
PA (OKLA-) OKLAHOMA MEDICAL RES FOUND.
PA (UNII ) UNIV ILLINOIS FOUND.
XX
PI Tang JJN, Koelsch G, Ghosh AK;
XX
DR WPI; 2002-619088/66.
XX
PT New memapsin 2 activity inhibitor useful in treatment of e.g.
PT Alzheimer's disease
XX
PS Disclosure; Fig 9; 74pp; English.
XX
CC The invention relates to an inhibitor of catalytically active memapsin 2
CC (an aspartic protease which can cleave at beta secretase sites), which
CC binds to the active site of memapsin 2 defined by the presence of two
CC catalytic aspartic residues and substrate binding cleft. Also
CC included is a method of determination of the substrate side-chain
CC preference in memapsin 2 sub-sites comprising: (a) reacting a mixture of
CC memapsin 2 substrates with memapsin 2, and determining the sub-site
CC preference of memapsin 2 by determining relative initial hydrolysis rates
CC of the mixture of memapsin 2 substrates; or (b) preparing a combinatorial
CC library of memapsin 2 inhibitors containing a base sequence taken from
CC OM99-2 (Glu-Val-Asn-Leu-Ala-Ala-Glu-phe), probing the library of
CC inhibitors with memapsin 2 which binds to several inhibitors to generate
CC several bound memapsin 2, and detecting the bound memapsin 2 with an
CC antibody raised to memapsin 2 and an alkaline phosphatase conjugated
CC secondary antibody. The inhibitors may be used in the manufacture of a
CC medicament for the treatment of Alzheimer's disease since memapsin 2 may
CC be involved in the cleavage of amyloid precursor protein (APP), and for
CC determining the substrate side-chain preference in memapsin 2 sub-sites.
CC The present sequence represents human memapsin 2 (either prepromemapsin 2
CC or mature memapsin).

XX 

SQ Sequence 501 AA; 


Query Match 99.7%; Score 2656; DB 23; Length 501; 

Best Local Similarity 99.8%; Pred. No. 2.5e-262; 

Matches 500; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       |||||||||||||||||||||
Db 481 RCLRCLRQQHDDFADDISLLK 501


RESULT 12 
AAM52697 

ID AAM52697 standard; Protein; 509 AA. 
XX 

AC AAM52697; 
XX 

DT 26-FEB-2002 (first entry) 
XX 

DE FLAG-tagged human beta-secretase.
XX 

KW Human; beta-secretase; FLAG tag; inhibitor; amine compound; 

KW beta amyloid protein production; head injury; spinal injury; 

KW amyloid precursor protein alpha secretion; nerve damage; 

KW meningitis sequela; cerebral paralysis; memory disorder; 

KW mental disease; nootropic; neuroprotective; cerebroprotective.

XX

OS Homo sapiens.

OS Synthetic.
XX

FH Key             Location/Qualifiers

FT Region          502..509

FT                 /label= FLAG_tag

XX

PN WO200187293-A1.
XX

PD 22-NOV-2001.
XX

PF 18-MAY-2001; 2001WO-JP04144.
XX

PR 19-MAY-2000; 2000JP-0152758.
XX 

PA (TAKE ) TAKEDA CHEM IND LTD. 
XX 

PI Miyamoto M, Matsui J, Fukumoto H, Tarui N; 
XX 

DR WPI; 2002-055640/07. 

DR N-PSDB; ABA02406. 
XX 

PT Beta-secretase inhibitor used for treating e.g. Alzheimer's disease and 

PT injury to brain or spine, and neurodegeneration, comprises amine 

PT compound - 
XX 

PS Examples; Page 79-81; 86pp; Japanese. 
XX 

CC The invention relates to novel amine compounds which are beta-secretase 

CC inhibitors. The beta-secretase compounds also have the ability to 

CC promote amyloid precursor protein alpha secretion and to inhibit beta 

CC amyloid protein production. The beta-secretase inhibitors of the 

CC invention can be used for treating head or spinal injuries, nerve damage, 

CC sequelae of meningitis, cerebral paralysis, memory disorders and mental 

CC diseases. The present sequence represents a FLAG-tagged human 

CC beta-secretase used in the exemplifications of the invention. 

XX 


SQ Sequence 509 AA; 


Query Match 99.7%; Score 2656; DB 23; Length 509; 

Best Local Similarity 99.8%; Pred. No. 2.6e-262; 

Matches 500; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       |||||||||||||||||||||
Db 481 RCLRCLRQQHDDFADDISLLK 501


RESULT 13 
AAW59807 

ID AAW59807 standard; Protein; 501 AA. 
XX 

AC AAW59807; 
XX 

DT 26-OCT-1998 (first entry) 
XX 

DE Amino acid sequence of human ASP2 (aspartic protease 2). 
XX 

KW Human; ASP2; aspartic protease 2; agonist; antagonist; immunospecific;

KW antibody; inhibition; Alzheimer's disease; cancer; proteinase; 

KW prohormone processing. 
XX 


OS Homo sapiens. 
XX 

PN EP855444-A2. 
XX 

PD 29-JUL-1998. 
XX 

PF 27-JAN-1998; 98EP-0300573.
XX

PR 28-JAN-1997; 97GB-0001684.
XX 

PA (SMIK ) SMITHKLINE BEECHAM CORP. 

PA (SMIK ) SMITHKLINE BEECHAM PLC. 
XX 

PI Chapman CG, Murphy K, Powell DJ, Smith TS; 
XX 

DR WPI; 1998-389809/34. 

DR N-PSDB; AAV41696. 
XX 

PT New nucleic acid encoding human aspartic protease 2 - used to treat, 

PT prevent and diagnose e.g. Alzheimer's disease, cancer and prohormone 

PT processing 
XX 

PS Claim 1; Page 7; 26pp; English.
XX 

CC This is the amino acid sequence of the human ASP2 (aspartic protease 

CC family), used in the method of the invention. Agonists and

CC antagonists for ASP2 immunospecific antibodies are used to treat

CC conditions requiring increased or decreased activity or expression of 

CC ASP2 respectively. ASP2 is used to treat and diagnose e.g. 

CC Alzheimer's disease, cancer and prohormone processing and ASP2 or a 

CC fragment can be used to induce an immune response against the above 

CC conditions.

XX 

SQ Sequence 501 AA; 

Query Match 99.5%; Score 2650; DB 19; Length 501; 
Best Local Similarity 99.6%; Pred. No. 1e-261;

Matches 499; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYEPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       |||||||||||||||||||||
Db 481 RCLRCLRQQHDDFADDISLLK 501


RESULT 14 
ABG09611 

ID ABG09611 standard; Protein; 969 AA. 
XX 

AC ABG09611; 
XX 

DT 13-FEB-2002 (first entry) 
XX 

DE Novel human diagnostic protein #9602. 
XX 

KW Human; chromosome mapping; gene mapping; gene therapy; forensic; 

KW food supplement; medical imaging; diagnostic; genetic disorder. 
XX 

OS Homo sapiens. 
XX 

PN WO200175067-A2.
XX

PD 11-OCT-2001.
XX

PF 30-MAR-2001; 2001WO-US08631.
XX

PR 31-MAR-2000; 2000US-0540217.

PR 23-AUG-2000; 2000US-0649167.
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Drmanac RT, Liu C, Tang YT; 
XX 

DR WPI; 2001-639362/73. 

DR N-PSDB; AAS73798. 
XX 

PT New isolated polynucleotide and encoded polypeptides, useful in 

PT diagnostics, forensics, gene mapping, identification of mutations 

PT responsible for genetic disorders or other traits and to assess 

PT biodiversity 
XX 

PS Claim 20; SEQ ID No 39970; 103pp; English. 
XX 


CC The invention relates to isolated polynucleotide (I) and 

CC polypeptide (II) sequences. (I) is useful as hybridisation probes, 

CC polymerase chain reaction (PCR) primers, oligomers, and for chromosome 

CC and gene mapping, and in recombinant production of (II). The

CC polynucleotides are also used in diagnostics as expressed sequence tags 

CC for identifying expressed genes. (I) is useful in gene therapy techniques 

CC to restore normal activity of (II) or to treat disease states involving 

CC (II). (II) is useful for generating antibodies against it, detecting or 

CC quantitating a polypeptide in tissue, as molecular weight markers and as 

CC a food supplement. (II) and its binding partners are useful in medical 

CC imaging of sites expressing (II). (I) and (II) are useful for treating 

CC disorders involving aberrant protein expression or biological activity. 

CC The polypeptide and polynucleotide sequences have applications in 

CC diagnostics, forensics, gene mapping, identification of mutations 

CC responsible for genetic disorders or other traits to assess biodiversity 

CC and to produce other types of data and products dependent on DNA and 

CC amino acid sequences. ABG00010-ABG30377 represent novel human 

CC diagnostic amino acid sequences of the invention. 

CC Note: The sequence data for this patent did not appear in the printed 

CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences. 
XX 

SQ Sequence 969 AA; 

Query Match 97.2%; Score 2588.5; DB 22; Length 969; 

Best Local Similarity 98.0%; Pred. No. 6e-255; 

Matches 492; Conservative 0; Mismatches 9; Indels 1; Gaps 1;

Qy    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
        ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
        ||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
Db  241 DHSLYTGSLWYTPIRRESYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||
Db  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEGTNQSFRIT 360

Qy  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||  ||||
Db  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGLLVSAC 420

Qy  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy  481 RCLRCLRQQHD-DFADDISLLK 501
        ||||||||||     |||||||
Db  481 RCLRCLRQQHGMTLPDDISLLK 502


RESULT 15 
AAB66572 
ID AAB66572 standard; Protein; 488 AA.
XX
AC AAB66572;
XX
DT 12-APR-2001 (first entry)
XX
DE Human memapsin 2.
XX
KW Human; memapsin 2; nootropic; neuroprotective; amyloid precursor protein;
KW APP; memapsin 2 inhibitor; Alzheimer's disease.
XX
OS Homo sapiens.
XX
PN WO200100665-A2.
XX
PD 04-JAN-2001.
XX
PF 27-JUN-2000; 2000WO-US17742.
XX
PR 28-JUN-1999; 99US-0141363.
PR 30-NOV-1999; 99US-0168060.
PR 25-JAN-2000; 2000US-0177836.
PR 27-JAN-2000; 2000US-0178368.
PR 08-JUN-2000; 2000US-0210292.
XX
PA (OKLA-) OKLAHOMA MEDICAL RES FOUND.
PA (UNII ) UNIV ILLINOIS FOUND.
XX
PI Tang JJN, Hong L, Ghosh AK;
XX
DR WPI; 2001-137933/14.
DR N-PSDB; AAF31848.
XX
PT Novel memapsin 2 inhibitors which bind to active site of memapsin 2
PT having 2 catalytic aspartic residues and substrate binding cleft, used
PT to treat Alzheimer's disease by blocking amyloid precursor protein
PT cleavage
XX
PS Example 1; Page 72-74; 86pp; English.
XX
CC The present sequence is given in a specification relating to an inhibitor
CC of catalytically active memapsin 2. The inhibitor binds to the memapsin 2
CC active site, which is defined by the presence of two catalytic aspartic
CC residues and a substrate binding cleft. The inhibitor is useful for
CC the treatment and diagnosis of Alzheimer's disease. It is useful in
CC screens for individuals with a genetic predisposition to Alzheimer's
CC disease. The inhibitor is useful as a reagent for specifically binding to
CC memapsin 2 or memapsin 2 analogues and for aiding in memapsin 2
CC isolation, purification and characterisation.

XX 

SQ Sequence 488 AA; 

Query Match 96.9%; Score 2582; DB 22; Length 488; 

Best Local Similarity 99.8%; Pred. No. 9e-255; 

Matches 487; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   14 AGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRGKSGQ 73
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 AGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRGKSGQ 60

Qy   74 GYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVYVPYT 133
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 GYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVYVPYT 120

Qy  134 QGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARPDDS 193
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 QGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARPDDS 180

Qy  194 LEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTP 253
        |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db  181 LEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTP 240

Qy  254 IRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAAS 313
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 IRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAAS 300

Qy  314 STEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDV 373
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 STEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDV 360

Qy  374 ATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTAAVEG 433
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTAAVEG 420

Qy  434 PFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQWRCLRCLRQQHDDF 493
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 PFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQWRCLRCLRQQHDDF 480

Qy  494 ADDISLLK 501
        ||||||||
Db  481 ADDISLLK 488


Search completed: January 21, 2004, 09:22:24 
Job time : 133.195 secs


GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 


OM protein - protein search, using sw model

Run on:         January 21, 2004, 09:19:55 ; Search time 45.0229 Seconds
                (without alignments)
                470.821 Million cell updates/sec

Title:          US-09-869-414A-4
Perfect score:  2664
Sequence:       1 MAQALPWLLLWMGAGVLPAH...CLRCLRQQHDDFADDISLLK 501

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       328717 seqs, 42310858 residues

Total number of hits satisfying chosen parameters: 328717

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      Issued_Patents_AA:*
                1: /cgn2_6/ptodata/1/iaa/5A_COMB.pep:*
                2: /cgn2_6/ptodata/1/iaa/5B_COMB.pep:*
                3: /cgn2_6/ptodata/1/iaa/6A_COMB.pep:*
                4: /cgn2_6/ptodata/1/iaa/6B_COMB.pep:*
                5: /cgn2_6/ptodata/1/iaa/PCTUS_COMB.pep:*
                6: /cgn2_6/ptodata/1/iaa/backfiles1.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
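
The throughput figure in the run header can be cross-checked from the other header values. The sketch below assumes that one "cell update" is one dynamic-programming cell, so the total is query length times database residues; that reading is an assumption, not something the report states, and the function name is hypothetical. The inputs are copied from the header (501-residue query, 42310858 residues, 45.0229 seconds).

```python
# Hypothetical helper; assumes one cell update = one (query residue, database residue) pair.
def cell_updates_per_second(query_length: int, db_residues: int, seconds: float) -> float:
    """Return dynamic-programming cells filled per second for one query scan."""
    return query_length * db_residues / seconds


# Values copied from the run header of this report.
rate = cell_updates_per_second(501, 42_310_858, 45.0229)
print(f"{rate / 1e6:.3f} million cell updates/sec")
```

Under that assumption the result agrees with the 470.821 million cell updates/sec printed in the header.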

SUMMARIES 


Result            Query
   No.    Score   Match  Length  DB  ID                  Description
     1     2656    99.7     501   4  US-09-548-372D-4    Sequence 4, Appli
     2     2656    99.7     501   4  US-09-548-367D-4    Sequence 4, Appli
     3     2656    99.7     501   4  US-09-551-853D-4    Sequence 4, Appli
     4     2650    99.5     501   4  US-09-009-191-2     Sequence 2, Appli
     5     2582    96.9     488   4  US-09-604-608-2     Sequence 2, Appli
     6     2582    96.9     501   4  US-09-713-158-2     Sequence 2, Appli
     7     2582    96.9     503   4  US-09-604-608-3     Sequence 3, Appli
     8     2567    96.4     501   4  US-09-548-372D-8    Sequence 8, Appli
     9     2567    96.4     501   4  US-09-548-367D-8    Sequence 8, Appli
    10     2567    96.4     501   4  US-09-551-853D-8    Sequence 8, Appli
    11   2506.5    94.1     476   4  US-09-548-372D-6    Sequence 6, Appli
    12   2506.5    94.1     476   4  US-09-548-367D-6    Sequence 6, Appli
    13   2506.5    94.1     476   4  US-09-551-853D-6    Sequence 6, Appli
    14   2420.5    90.9     476   4  US-09-548-372D-73   Sequence 73, Appl
    15   2420.5    90.9     476   4  US-09-548-367D-73   Sequence 73, Appl
    16   2420.5    90.9     476   4  US-09-551-853D-73   Sequence 73, Appl
    17     2397    90.0     453   4  US-09-548-372D-30   Sequence 30, Appl
    18     2397    90.0     453   4  US-09-548-367D-30   Sequence 30, Appl
    19     2397    90.0     453   4  US-09-551-853D-30   Sequence 30, Appl
    20     2397    90.0     459   4  US-09-548-372D-32   Sequence 32, Appl
    21     2397    90.0     459   4  US-09-548-367D-32   Sequence 32, Appl
    22     2397    90.0     459   4  US-09-551-853D-32   Sequence 32, Appl
    23     2315    86.9     774   4  US-09-009-191-4     Sequence 4, Appli
    24   2291.5    86.0     446   4  US-09-548-372D-22   Sequence 22, Appl
    25   2291.5    86.0     446   4  US-09-548-367D-22   Sequence 22, Appl
    26   2291.5    86.0     446   4  US-09-551-853D-22   Sequence 22, Appl
    27     2288    85.9     433   4  US-09-548-372D-26   Sequence 26, Appl
    28     2288    85.9     433   4  US-09-548-367D-26   Sequence 26, Appl
    29     2288    85.9     433   4  US-09-551-853D-26   Sequence 26, Appl
    30     2288    85.9     459   4  US-09-548-372D-24   Sequence 24, Appl
    31     2288    85.9     459   4  US-09-548-367D-24   Sequence 24, Appl
    32     2288    85.9     459   4  US-09-551-853D-24   Sequence 24, Appl
    33   2247.5    84.4     428   4  US-09-548-372D-51   Sequence 51, Appl
    34   2247.5    84.4     428   4  US-09-548-367D-51   Sequence 51, Appl
    35   2247.5    84.4     428   4  US-09-551-853D-51   Sequence 51, Appl
    36   2247.5    84.4     434   4  US-09-548-372D-53   Sequence 53, Appl
    37   2247.5    84.4     434   4  US-09-548-367D-53   Sequence 53, Appl
    38   2247.5    84.4     434   4  US-09-551-853D-53   Sequence 53, Appl
    39     2104    79.0     425   4  US-09-548-372D-28   Sequence 28, Appl
    40     2104    79.0     425   4  US-09-548-367D-28   Sequence 28, Appl
    41     2104    79.0     425   4  US-09-551-853D-28   Sequence 28, Appl
    42   1173.5    44.1     518   3  US-08-999-723-2     Sequence 2, Appli
    43   1173.5    44.1     518   3  US-09-434-427-2     Sequence 2, Appli
    44   1173.5    44.1     518   4  US-09-548-372D-2    Sequence 2, Appli
    45   1173.5    44.1     518   4  US-09-548-367D-2    Sequence 2, Appli



ALIGNMENTS 


RESULT 1 

US-09-548-372D-4 

; Sequence 4, Application US/09548372D
; Patent No. 6420534
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/62801
; CURRENT APPLICATION NUMBER: US/09/548,372D
; CURRENT FILING DATE: 2000-04-12
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 4
; LENGTH: 501
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-548-372D-4 

Query Match 99.7%; Score 2656; DB 4; Length 501; 

Best Local Similarity 99.8%; Pred. No. 7.4e-267; 

Matches 500; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
        ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy  481 RCLRCLRQQHDDFADDISLLK 501
        |||||||||||||||||||||
Db  481 RCLRCLRQQHDDFADDISLLK 501
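
The two percentages printed with each result can be reproduced from the other figures on the same lines. The sketch below is an inference from the numbers in this report rather than a documented GenCore definition: it assumes Query Match is the score divided by the perfect score, and Best Local Similarity is the number of matches divided by the number of aligned columns (matches, conservative substitutions, mismatches and indels). Function names are hypothetical.

```python
# Hypothetical helpers; both formulas are assumptions inferred from this report's figures.
def query_match_percent(score: float, perfect_score: float) -> float:
    """Score of the hit as a percentage of the query's self-alignment score."""
    return 100.0 * score / perfect_score


def best_local_similarity_percent(matches: int, conservative: int,
                                  mismatches: int, indels: int) -> float:
    """Identical columns as a percentage of all aligned columns."""
    aligned_columns = matches + conservative + mismatches + indels
    return 100.0 * matches / aligned_columns


# RESULT 1 above: Score 2656 against Perfect score 2664; 500 matches, 1 mismatch.
qm = query_match_percent(2656, 2664)
bls = best_local_similarity_percent(500, 0, 1, 0)
print(f"Query Match {qm:.1f}%; Best Local Similarity {bls:.1f}%")
```

With these assumptions the values round to the 99.7% and 99.8% reported for RESULT 1, and the same formulas reproduce the figures for a hit that contains an indel (492 matches, 9 mismatches, 1 indel gives 98.0%).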


RESULT 2 

US-09-548-367D-4 

; Sequence 4, Application US/09548367D
; Patent No. 6440698
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/6280H
; CURRENT APPLICATION NUMBER: US/09/548,367D
; CURRENT FILING DATE: 2000-04-12
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 4
; LENGTH: 501
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-548-367D-4 

Query Match 99.7%; Score 2656; DB 4; Length 501; 

Best Local Similarity 99.8%; Pred. No. 7.4e-267; 

Matches 500; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
        ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy  481 RCLRCLRQQHDDFADDISLLK 501
        |||||||||||||||||||||
Db  481 RCLRCLRQQHDDFADDISLLK 501


RESULT 3 

US-09-551-853D-4 

; Sequence 4, Application US/09551853D
; Patent No. 6500667
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/6280L
; CURRENT APPLICATION NUMBER: US/09/551,853D
; CURRENT FILING DATE: 2000-04-18
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 4
; LENGTH: 501
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-551-853D-4 

Query Match 99.7%; Score 2656; DB 4; Length 501; 

Best Local Similarity 99.8%; Pred. No. 7.4e-267; 

Matches 500; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
        ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy  481 RCLRCLRQQHDDFADDISLLK 501
        |||||||||||||||||||||
Db  481 RCLRCLRQQHDDFADDISLLK 501


RESULT 4 
US-09-009-191-2 

; Sequence 2, Application US/09009191
; Patent No. 6319689
; GENERAL INFORMATION:
; APPLICANT: POWELL, DAVID
; APPLICANT: CHAPMAN, CONRAD
; APPLICANT: MURPHY, KAY
; APPLICANT: SMITH, TRUDI
; TITLE OF INVENTION: ASP2
; NUMBER OF SEQUENCES: 6
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: RATNER & PRESTIA
; STREET: P.O. BOX 980
; CITY: VALLEY FORGE
; STATE: PA
; COUNTRY: USA
; ZIP: 19482
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ for Windows Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/009,191
; FILING DATE: 20-JAN-1998
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: UK 9701684.4
; FILING DATE: 28-JAN-1997
; ATTORNEY/AGENT INFORMATION:
; NAME: PRESTIA, PAUL F
; REGISTRATION NUMBER: 23,031
; REFERENCE/DOCKET NUMBER: GH-70368
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 610-407-0700
; TELEFAX: 610-407-0701
; TELEX: 846169
; INFORMATION FOR SEQ ID NO: 2:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 501 amino acids
; TYPE: amino acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: protein
US-09-009-191-2 

Query Match 99.5%; Score 2650; DB 4; Length 501; 

Best Local Similarity 99.6%; Pred. No. 3.1e-266; 

Matches 499; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy  121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
        ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 YRDLRKGVYEPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
        ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db  181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy  481 RCLRCLRQQHDDFADDISLLK 501
        |||||||||||||||||||||
Db  481 RCLRCLRQQHDDFADDISLLK 501


RESULT 5 
US-09-604-608-2 

; Sequence 2, Application US/09604608
; Patent No. 6545127
; GENERAL INFORMATION:
; APPLICANT: Tang, Jordan J.N.
; APPLICANT: Lin, Xinli
; APPLICANT: Koelsch, Gerald
; TITLE OF INVENTION: Catalytically Active Recombinant Memapsin and Methods
; TITLE OF INVENTION: of Use Thereof
; FILE REFERENCE: OMRF 179
; CURRENT APPLICATION NUMBER: US/09/604,608
; CURRENT FILING DATE: 2000-06-27
; PRIOR APPLICATION NUMBER: 60/141,363
; PRIOR FILING DATE: 1999-06-28
; PRIOR APPLICATION NUMBER: 60/168,060
; PRIOR FILING DATE: 1999-11-30
; PRIOR APPLICATION NUMBER: 60/177,836
; PRIOR FILING DATE: 2000-01-25
; PRIOR APPLICATION NUMBER: 60/178,368
; PRIOR FILING DATE: 2000-01-27
; PRIOR APPLICATION NUMBER: 60/210,292
; PRIOR FILING DATE: 2000-06-08
; NUMBER OF SEQ ID NOS: 31
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 2
; LENGTH: 488
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE:
; OTHER INFORMATION: Purified Memapsin 2
; OTHER INFORMATION: Amino Acids 28-48 are remnant putative propeptide
; OTHER INFORMATION: residues
; OTHER INFORMATION: Amino Acids 58-61, 78, 80, 82-83, 116, 118-121,
; OTHER INFORMATION: 156, 166, 174, 246, 274, 276, 278-281, 283, and
; OTHER INFORMATION: 376-377 are residues in contact with the OM99-2
; OTHER INFORMATION: inhibitor
; OTHER INFORMATION: Amino acids 54-57, 61-68, 73-80, 86-89, 109-111,
; OTHER INFORMATION: 113-118, 123-134, 143-154, 165-168, 198-202, and
; OTHER INFORMATION: 220-224 are N-lobe Beta Strands
; OTHER INFORMATION: Amino Acids 184-191 and 210-217 are N-lobe Helices
; OTHER INFORMATION: Amino acids 237-240, 247-249, 251-256, 259-260,
; OTHER INFORMATION: 273-275, 282-285, 316-318, 331-336, 342-348,
; OTHER INFORMATION: 354-357, 366-370, 372-375, 380-383, 390-395,
; OTHER INFORMATION: 400-405, and 418-420 are C-lobe Beta Strands
; OTHER INFORMATION: Amino Acids 286-299, 307-310, 350-353, 384-387,
; OTHER INFORMATION: and 427-431 are C-lobe Helices
US-09-604-608-2 


Query Match 96.9%; Score 2582; DB 4; Length 488; 

Best Local Similarity 99.8%; Pred. No. 3.4e-259; 

Matches 487; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   14 AGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRGKSGQ 73
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 AGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRGKSGQ 60

Qy   74 GYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVYVPYT 133
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 GYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVYVPYT 120

Qy  134 QGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARPDDS 193
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 QGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARPDDS 180

Qy  194 LEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTP 253
        |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db  181 LEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTP 240

Qy  254 IRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAAS 313
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 IRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAAS 300

Qy  314 STEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDV 373
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 STEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDV 360

Qy  374 ATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTAAVEG 433
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTAAVEG 420

Qy  434 PFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQWRCLRCLRQQHDDF 493
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 PFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQWRCLRCLRQQHDDF 480

Qy  494 ADDISLLK 501
        ||||||||
Db  481 ADDISLLK 488


RESULT 6 
US-09-713-158-2 

; Sequence 2, Application US/09713158
; Patent No. 6361975
; GENERAL INFORMATION:
; APPLICANT: ZHU, YUAN
; APPLICANT: LI, XIAOTONG
; APPLICANT: POWELL, DAVID J.
; APPLICANT: CHRISTIE, GARY
; TITLE OF INVENTION: MOUSE ASPARTIC SECRETASE-2 (MASP-2)
; FILE REFERENCE: GP-70660
; CURRENT APPLICATION NUMBER: US/09/713,158
; CURRENT FILING DATE: 2000-11-15
; PRIOR APPLICATION NUMBER: 60/165,800
; PRIOR FILING DATE: 1999-11-16
; NUMBER OF SEQ ID NOS: 2
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 2
; LENGTH: 501
; TYPE: PRT
; ORGANISM: MUS MUSCULUS
US-09-713-158-2 


Query Match 96.9%; Score 2582; DB 4; Length 501; 

Best Local Similarity 96.6%; Precl. No. 3.6e-259; 

Matches 484; Conservative 7; Mismatches 10; Indels 0; Gaps 0; 

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
     |||||||||||:|:|:||| ||  |||||||||| | ||||||||||||| |||||||||
Db 1 MAQALPWLLLWVGSGMLPAQGTHLGIRLPLRSGLAGPPLGLRLPRETDEESEEPGRRGSF 60

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||:||:||| ||||||||||:| |||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTEALASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       ||||||| |||||||||||||
Db 481 RCLRCLRHQHDDFADDISLLK 501


RESULT 7 
US-09-604-608-3 

; Sequence 3, Application US/09604608
; Patent No. 6545127
; GENERAL INFORMATION:
; APPLICANT: Tang, Jordan J.N.
; APPLICANT: Lin, Xinli
; APPLICANT: Koelsch, Gerald
; TITLE OF INVENTION: Catalytically Active Recombinant Memapsin and Methods
; TITLE OF INVENTION: of Use Thereof
; FILE REFERENCE: OMRF 179
; CURRENT APPLICATION NUMBER: US/09/604,608
; CURRENT FILING DATE: 2000-06-27
; PRIOR APPLICATION NUMBER: 60/141,363
; PRIOR FILING DATE: 1999-06-28
; PRIOR APPLICATION NUMBER: 60/168,060
; PRIOR FILING DATE: 1999-11-30
; PRIOR APPLICATION NUMBER: 60/177,836
; PRIOR FILING DATE: 2000-01-25
; PRIOR APPLICATION NUMBER: 60/178,368
; PRIOR FILING DATE: 2000-01-27
; PRIOR APPLICATION NUMBER: 60/210,292
; PRIOR FILING DATE: 2000-06-08
; NUMBER OF SEQ ID NOS: 31
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 3
; LENGTH: 503
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE:
; OTHER INFORMATION: Pro-memapsin 2
; OTHER INFORMATION: Amino Acids 1-15 are vector-derived residues
; OTHER INFORMATION: Amino Acids 16-64 are a putative pro peptide
; OTHER INFORMATION: Amino Acids 1-13 are the T7 promoter
; OTHER INFORMATION: Amino Acids 16-456 are Pro-memapsin 2-T1
; OTHER INFORMATION: Amino Acids 16-421 are Promemapsin 2-T2
US-09-604-608-3

Query Match 96.9%; Score 2582; DB 4; Length 503;
Best Local Similarity 99.8%; Pred. No. 3.6e-259;
Matches 487; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy 14 AGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRGKSGQ 73
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 16 AGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRGKSGQ 75

Qy 74 GYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVYVPYT 133
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 76 GYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVYVPYT 135

Qy 134 QGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARPDDS 193
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 136 QGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARPDDS 195

Qy 194 LEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTP 253
       |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db 196 LEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTP 255

Qy 254 IRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAAS 313
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 256 IRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAAS 315

Qy 314 STEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDV 373
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 316 STEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDV 375

Qy 374 ATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTAAVEG 433
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 376 ATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTAAVEG 435

Qy 434 PFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQWRCLRCLRQQHDDF 493
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 436 PFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQWRCLRCLRQQHDDF 495

Qy 494 ADDISLLK 501
       ||||||||
Db 496 ADDISLLK 503


RESULT 8 

US-09-548-372D-8 

; Sequence 8, Application US/09548372D
; Patent No. 6420534
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/62801
; CURRENT APPLICATION NUMBER: US/09/548,372D
; CURRENT FILING DATE: 2000-04-12
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 8
; LENGTH: 501
; TYPE: PRT
; ORGANISM: Mus musculus
US-09-548-372D-8

Query Match 96.4%; Score 2567; DB 4; Length 501;
Best Local Similarity 96.2%; Pred. No. 1.3e-257;
Matches 482; Conservative 7; Mismatches 12; Indels 0; Gaps 0;

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
     || || |||||:|:|:||| ||  |||||||||| | ||||||||||||| |||||||||
Db 1 MAPALHWLLLWVGSGMLPAQGTHLGIRLPLRSGLAGPPLGLRLPRETDEESEEPGRRGSF 60

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||:||:||| ||||||||||:| |||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTEALASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       ||||||| |||||||||||||
Db 481 RCLRCLRHQHDDFADDISLLK 501


RESULT 9 

US-09-548-367D-8 

; Sequence 8, Application US/09548367D
; Patent No. 6440698
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/6280H
; CURRENT APPLICATION NUMBER: US/09/548,367D
; CURRENT FILING DATE: 2000-04-12
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 8
; LENGTH: 501
; TYPE: PRT
; ORGANISM: Mus musculus
US-09-548-367D-8

Query Match 96.4%; Score 2567; DB 4; Length 501;
Best Local Similarity 96.2%; Pred. No. 1.3e-257;
Matches 482; Conservative 7; Mismatches 12; Indels 0; Gaps 0;

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
     || || |||||:|:|:||| ||  |||||||||| | ||||||||||||| |||||||||
Db 1 MAPALHWLLLWVGSGMLPAQGTHLGIRLPLRSGLAGPPLGLRLPRETDEESEEPGRRGSF 60

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||:||:||| ||||||||||:| |||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTEALASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       ||||||| |||||||||||||
Db 481 RCLRCLRHQHDDFADDISLLK 501


RESULT 10 
US-09-551-853D-8 

; Sequence 8, Application US/09551853D
; Patent No. 6500667
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/6280L
; CURRENT APPLICATION NUMBER: US/09/551,853D
; CURRENT FILING DATE: 2000-04-18
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 8
; LENGTH: 501
; TYPE: PRT
; ORGANISM: Mus musculus
US-09-551-853D-8

Query Match 96.4%; Score 2567; DB 4; Length 501;
Best Local Similarity 96.2%; Pred. No. 1.3e-257;
Matches 482; Conservative 7; Mismatches 12; Indels 0; Gaps 0;

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
     || || |||||:|:|:||| ||  |||||||||| | ||||||||||||| |||||||||
Db 1 MAPALHWLLLWVGSGMLPAQGTHLGIRLPLRSGLAGPPLGLRLPRETDEESEEPGRRGSF 60

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||:||:||| ||||||||||:| |||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTEALASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       ||||||| |||||||||||||
Db 481 RCLRCLRHQHDDFADDISLLK 501


RESULT 11 

US-09-548-372D-6 

; Sequence 6, Application US/09548372D
; Patent No. 6420534
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/62801
; CURRENT APPLICATION NUMBER: US/09/548,372D
; CURRENT FILING DATE: 2000-04-12
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 6
; LENGTH: 476
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-548-372D-6

Query Match 94.1%; Score 2506.5; DB 4; Length 476;
Best Local Similarity 95.0%; Pred. No. 2.3e-251;
Matches 476; Conservative 0; Mismatches 0; Indels 25; Gaps 1;

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       |||||||||                         ||||||||||||||||||||||||||
Db 181 GLAYAEIAR-------------------------LCGAGFPLNQSEVLASVGGSMIIGGI 215

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 216 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 275

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 276 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 335

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 336 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 395

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 396 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 455

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       |||||||||||||||||||||
Db 456 RCLRCLRQQHDDFADDISLLK 476
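
The single 25-residue gap in this result gives a way to check how the header's gap parameters (Gapop 10.0, Gapext 0.5) appear to be applied. The sketch below is an inference from the numbers printed in this report, not a statement of the search program's documented behaviour: it assumes a gap of n residues costs Gapop + n x Gapext, and that each identical pair scores its BLOSUM62 diagonal value. The names `BLOSUM62_SELF` and `gap_penalty` are illustrative only.

```python
# BLOSUM62 diagonal (self-match) scores for the residues found in the gap.
BLOSUM62_SELF = {"P": 7, "D": 6, "S": 4, "L": 4, "E": 5, "F": 6,
                 "V": 4, "K": 5, "Q": 5, "T": 5, "H": 8, "N": 6}

GAP_OPEN = 10.0         # "Gapop" from the report header
GAP_EXTEND = 0.5        # "Gapext" from the report header
PERFECT_SCORE = 2664.0  # "Perfect score": the query aligned to itself


def gap_penalty(length, gap_open=GAP_OPEN, gap_extend=GAP_EXTEND):
    """Assumed affine form: one opening cost plus a cost per gapped residue."""
    return gap_open + length * gap_extend


# Query residues 190-214, which have no counterpart in the 476-residue hit.
deleted = "PDDSLEPFFDSLVKQTHVPNLFSLH"

# Score lost because these residues no longer pair with themselves.
lost_matches = sum(BLOSUM62_SELF[residue] for residue in deleted)

score = PERFECT_SCORE - lost_matches - gap_penalty(len(deleted))
print(len(deleted), lost_matches, gap_penalty(len(deleted)), score)
```

Under these assumptions the result is 2506.5, the score printed for this result. The alternative convention, Gapop + (n - 1) x Gapext, would give 2507.0, so the printed score favours the first form.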


RESULT 12 
US-09-548-367D-6 

; Sequence 6, Application US/09548367D
; Patent No. 6440698
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/6280H
; CURRENT APPLICATION NUMBER: US/09/548,367D
; CURRENT FILING DATE: 2000-04-12
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 6
; LENGTH: 476
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-548-367D-6

Query Match 94.1%; Score 2506.5; DB 4; Length 476;
Best Local Similarity 95.0%; Pred. No. 2.3e-251;
Matches 476; Conservative 0; Mismatches 0; Indels 25; Gaps 1;


Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       |||||||||                         ||||||||||||||||||||||||||
Db 181 GLAYAEIAR-------------------------LCGAGFPLNQSEVLASVGGSMIIGGI 215

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 216 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 275

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 276 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 335

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 336 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 395

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 396 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 455

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       |||||||||||||||||||||
Db 456 RCLRCLRQQHDDFADDISLLK 476


RESULT 13 
US-09-551-853D-6 

; Sequence 6, Application US/09551853D
; Patent No. 6500667
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/6280L
; CURRENT APPLICATION NUMBER: US/09/551,853D
; CURRENT FILING DATE: 2000-04-18
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 6
; LENGTH: 476
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-551-853D-6

Query Match 94.1%; Score 2506.5; DB 4; Length 476;
Best Local Similarity 95.0%; Pred. No. 2.3e-251;
Matches 476; Conservative 0; Mismatches 0; Indels 25; Gaps 1;
Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       |||||||||                         ||||||||||||||||||||||||||
Db 181 GLAYAEIAR-------------------------LCGAGFPLNQSEVLASVGGSMIIGGI 215

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 216 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 275

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 276 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 335

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 336 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 395

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 396 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 455

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       |||||||||||||||||||||
Db 456 RCLRCLRQQHDDFADDISLLK 476


RESULT 14 
US-09-548-372D-73 

; Sequence 73, Application US/09548372D
; Patent No. 6420534
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/62801
; CURRENT APPLICATION NUMBER: US/09/548,372D
; CURRENT FILING DATE: 2000-04-12
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 73
; LENGTH: 476
; TYPE: PRT
; ORGANISM: Mus musculus
US-09-548-372D-73

Query Match 90.9%; Score 2420.5; DB 4; Length 476;
Best Local Similarity 91.8%; Pred. No. 1.9e-242;
Matches 460; Conservative 5; Mismatches 11; Indels 25; Gaps 1;

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
     || || |||||:|:|:||| ||  |||||||||| | ||||||||||||| |||||||||
Db 1 MAPALHWLLLWVGSGMLPAQGTHLGIRLPLRSGLAGPPLGLRLPRETDEESEEPGRRGSF 60

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       |||||||||                         ||||||||||:| |||||||||||||
Db 181 GLAYAEIAR-------------------------LCGAGFPLNQTEALASVGGSMIIGGI 215

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 216 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 275

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 276 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 335

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||||
Db 336 ILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 395

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
Db 396 HVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 455

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       ||||||| |||||||||||||
Db 456 RCLRCLRHQHDDFADDISLLK 476


RESULT 15 
US-09-548-367D-73 

; Sequence 73, Application US/09548367D
; Patent No. 6440698
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/6280H
; CURRENT APPLICATION NUMBER: US/09/548,367D
; CURRENT FILING DATE: 2000-04-12
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 73
; LENGTH: 476
; TYPE: PRT
; ORGANISM: Mus musculus
US-09-548-367D-73

Query Match 90.9%; Score 2420.5; DB 4; Length 476;
Best Local Similarity 91.8%; Pred. No. 1.9e-242;
Matches 460; Conservative 5; Mismatches 11; Indels 25; Gaps 1;


Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
     || || |||||:|:|:||| ||  |||||||||| | ||||||||||||| |||||||||
Db 1 MAPALHWLLLWVGSGMLPAQGTHLGIRLPLRSGLAGPPLGLRLPRETDEESEEPGRRGSF 60

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       |||||||||                         ||||||||||:| |||||||||||||
Db 181 GLAYAEIAR-------------------------LCGAGFPLNQTEALASVGGSMIIGGI 215

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 216 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 275

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 276 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 335

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||||
Db 336 ILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 395

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
Db 396 HVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 455

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       ||||||| |||||||||||||
Db 456 RCLRCLRHQHDDFADDISLLK 476


Search completed: January 21, 2004, 09:27:07 
Job time : 46.0229 secs


GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 


OM protein - protein search, using sw model 

Run on: January 21, 2004, 09:16:55 ; Search time 45.9809 Seconds 

(without alignments) 
1047.838 Million cell updates/sec 
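
The throughput figure above appears consistent with counting one dynamic-programming cell per (query residue, database residue) pair and dividing by the search time. The sketch below makes that reading explicit; it is an inference from this report's own numbers (a 501-residue query, 96168682 residues searched, 45.9809 seconds), and the function name is illustrative only.

```python
QUERY_LENGTH = 501          # residues in the query sequence
DB_RESIDUES = 96_168_682    # "Searched: ... residues" in this report
SEARCH_SECONDS = 45.9809    # "Search time ... Seconds" (without alignments)


def million_cell_updates_per_sec(query_len, db_residues, seconds):
    """Cells in the alignment matrix divided by elapsed time, in millions."""
    cells = query_len * db_residues
    return cells / seconds / 1e6


rate = million_cell_updates_per_sec(QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS)
print(f"{rate:.2f} million cell updates/sec")
```

Because the printed search time is itself rounded, the computed rate only agrees with the printed 1047.838 to within a small rounding margin.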


Title: US-09-869-414A-4
Perfect score: 2664
Sequence: 1 MAQALPWLLLWMGAGVLPAH..........CLRCLRQQHDDFADDISLLK 501


Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 283308 seqs, 96168682 residues

Total number of hits satisfying chosen parameters: 283308

Minimum DB seq length: 0
Maximum DB seq length: 2000000000


Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 


Database: PIR_76:*
pir1:*
pir2:*
pir3:*
pir4:*


Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
ALIGNMENTS


RESULT 1 
A59090 

aspartic proteinase (EC 3.4.23.-) BACE precursor - human
N;Alternate names: beta-secretase; beta-site APP cleaving enzyme
C;Species: Homo sapiens (man)
C;Date: 29-Oct-1999 #sequence_revision 29-Oct-1999 #text_change 11-May-2000
C;Accession: A59090
R;Vassar, R.; Bennett, B.D.; Babu-Khan, S.; Kahn, S.; Mendiaz, E.A.; Denis, P
Teplow, D.B.; Ross, S.; Amarante, P.; Loeloff, R.; Luo, Y.; Fisher, S.; Fulle
J.; Edenson, S.; Lile, J.; Jarosinski, M.A.; Biere, A.L.; Curran, E.; Burgess
T.; Louis, J.C.; Collins, F.; Treanor, J.; Rogers, G.; Citron, M.
Science 286, 735-741, 1999
A;Title: beta-Secretase cleavage of Alzheimer's amyloid precursor protein by
transmembrane aspartic protease BACE.
A;Reference number: A59090; MUID:20002972; PMID:10531052
A;Note: submitted to GenBank, September 1999
A;Accession: A59090
A;Status: not compared with conceptual translation
A;Molecule type: mRNA
A;Residues: 1-501 <VAS>
A;Cross-references: GB:AF190725; NID:g6118538; PIDN:AAF04142.1; PID:g6118539
C;Genetics:
A;Gene: BACE
C;Superfamily: beta-secretase
C;Keywords: Alzheimer's disease; aspartic proteinase; brain; glycoprotein;
hydrolase; protein digestion; transmembrane protein; zymogen
F;1-21/Domain: signal sequence #status predicted <SIG>
F;22-45/Domain: propeptide #status predicted <PRO>
F;46-501/Product: acid proteinase BACE #status predicted <MAT>
F;461-477/Domain: transmembrane #status predicted <TRN>
F;93,289/Active site: Asp #status predicted
F;153,172,223,354/Binding site: carbohydrate (Asn) (covalent) #status predicted
F;330-380/Disulfide bonds: #status predicted

Query Match 99.7%; Score 2656; DB 2; Length 501; 

Best Local Similarity 99.8%; Pred. No. 6.9e-206; 

Matches 500; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 
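The percentages in this summary can be reproduced from the other printed values. The relations below are inferred from the numbers in this report rather than taken from GenCore documentation, so treat them as an assumption: "Query Match" looks like the score divided by the perfect score (2664), and "Best Local Similarity" looks like the number of matches divided by the total number of alignment columns. The function names are hypothetical.

```python
def query_match(score, perfect_score):
    # Percentage of the query's self-score achieved by this hit.
    return 100.0 * score / perfect_score


def best_local_similarity(matches, conservative, mismatches, indels):
    # Identical columns as a percentage of all alignment columns.
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Values for RESULT 1 (A59090) as printed above.
print(round(query_match(2656, 2664), 1))
print(round(best_local_similarity(500, 0, 1, 0), 1))
```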

Qy   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       |||||||||||||||||||||
Db 481 RCLRCLRQQHDDFADDISLLK 501


RESULT 2 
JC7574 


pepsinogen A - African clawed frog
C;Species: Xenopus laevis (African clawed frog)
C;Date: 30-Jun-2001 #sequence_revision 30-Jun-2001 #text_change 03-Aug-2001
C;Accession: JC7574; PC7119
R;Ikuzawa, M.; Inokuchi, T.; Kobayashi, K.; Yasumasu, S.
J. Biochem. 129, 147-153, 2001
A;Title: Amphibian pepsinogens: Purification and characterization of Xenopus
pepsinogens, and molecular cloning of Xenopus and bullfrog pepsinogens.
A;Reference number: JC7573; MUID:21064922; PMID:11134969
A;Contents: Stomach
A;Accession: JC7574
A;Molecule type: mRNA
A;Residues: 1-384 <IKU>
A;Cross-references: DDBJ:AB045380
A;Accession: PC7119
A;Molecule type: protein
A;Residues: 16-35;57-76 <IK2>
C;Comment: This protein is a zymogen for gastric aspartic proteinase, with
pepsin-like activity.
C;Genetics:
A;Gene: PgA
C;Superfamily: pepsin
C;Keywords: stomach; zymogen

Query Match 12.2%; Score 324; DB 2; Length 384; 

Best Local Similarity 25.5%; Pred. No. 3.1e-18; 

Matches 113; Conservative 73; Mismatches 158; Indels 100; Gaps 19; 
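Each hit carries the same three summary lines of `name value;` pairs, which makes them straightforward to extract programmatically. A minimal parsing sketch, assuming the field layout seen in this report (the regular expression and the name `parse_stats` are illustrative, not part of GenCore):

```python
import re

# A field name (letters, dots, spaces), then a number with optional
# exponent and optional percent sign, terminated by a semicolon.
STAT_RE = re.compile(
    r"([A-Za-z][A-Za-z. ]*?)\s+([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)%?;"
)


def parse_stats(text):
    """Map each field name in a hit summary to its numeric value."""
    return {name.strip(): float(value) for name, value in STAT_RE.findall(text)}


summary = """Query Match 12.2%; Score 324; DB 2; Length 384;
Best Local Similarity 25.5%; Pred. No. 3.1e-18;
Matches 113; Conservative 73; Mismatches 158; Indels 100; Gaps 19;"""

stats = parse_stats(summary)
print(stats["Score"], stats["Pred. No."])
```

Fields damaged by character-recognition errors (for example a digit 1 read as the letter l) will not match the numeric pattern and are simply skipped.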

LLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVE 62


Qy 

8 

Db 

3 

Qy 

63 

Db 

48 

Qy 

110 

Db 

105 

Qy 

164 

Db 

162 

Qy 

222 

Db 

218 

Qy 

280 

Db 

269 

Qy 

339 

Db 

316 


I I I : I I I : : : : : I I I I | : | | : : : 

LLLLLGLWL SECWKVPLRKG ESFRNRPQRLGLLGDYLKKNPYN 47 


109 


I : I I I : : I I I I : I I 


I III:: I : I I I I I I : : I : : : : 

'QQSSTFQATNTPVSIQYGTGSMSGFLGYDTLQV GNIQISNQMFGL 161 


II: | : | : : | | | | M : II III:: I : I I I I I : : I 


217 


\SVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDL — KMDCK 279 

I : : M : I : I I : I I I : I : I I : : : : INN: I 
-QTGSYVLFGGVDNSYYSGSLNWVPLTAETYWQITLDSVSINGQVIACSQSC- 2 68 


::|||:||: : I I ::: II |: :: I I 

■QAIVDTGTSLMTGPSTPI-ANIQNYIGASQDSN GQYVTNCNNI SNMPT 315 


392 


I I I 


Qy 393 VMGAVIMEGFYVVFDRARKRIGFA 416

Db 358 ILGDVFIRQYFTVFDRANNYVAIA 381


RESULT 3 
JC7575 

pepsinogen A - bullfrog
C;Species: Rana catesbeiana (bullfrog)
C;Date: 30-Jun-2001 #sequence_revision 30-Jun-2001 #text_change 03-Aug-2001
C;Accession: JC7575
R;Ikuzawa, M.; Inokuchi, T.; Kobayashi, K.; Yasumasu, S.
J. Biochem. 129, 147-153, 2001
A;Title: Amphibian pepsinogens: Purification and characterization of Xenopus
pepsinogens, and molecular cloning of Xenopus and bullfrog pepsinogens.
A;Reference number: JC7573; MUID:21064922; PMID:11134969
A;Contents: Stomach
A;Accession: JC7575
A;Molecule type: mRNA
A;Residues: 1-385 <IKU>
A;Cross-references: DDBJ:AB045376
C;Comment: This protein is a zymogen for gastric aspartic proteinase, with
pepsin-like activity.
C;Genetics:
A;Gene: PgA
C;Superfamily: pepsin
C;Keywords: stomach; zymogen

Query Match 11.8%; Score 313.5; DB 2; Length 385; 

Best Local Similarity 26.6%; Pred. No. 2.2e-17; 

Matches 117; Conservative 74; Mismatches 158; Indels 91; Gaps 20; 

LLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFV 61 

: I I I I I I : : : I I I I I I I I I : : 

ILLLFGLWLAECGV VKVSLRK GESLRARLNR LGLLGDYLKKHHYN 48 

EMVDNLRGKSGQ GYYVEMTVGS PPQTLNI LVDTGS SNFAVG AAPH 106 

: : I ||: I : : : : I : I I I : : : I I I I M I : : I 


Ml:: I : I I Mill: I : : I : : : I 


Qy 

8 

Db 

3 

Qy 

62 

Db 

49 

Qy 

107 

Db 

109 

Qy 

167 

Db 

166 

Qy 

225 

Db 

220 

Qy 

283 

Db 

270 

Qy 

343 


MINI: : I III:: I : I : I I I : : I 


I I : I I I I I : I : I : I I : :: : : I I I : I 

'GGVDTSYYTGNLNWVPLTAETYWQITVDSISIGGQVIACSGSC 269 


I I I : I I : I I : : I : : : | : : : I I : I 

lIVDTGTSLLAGP STPIANIOYYIGANQDSNGQYV INCNNISNMPTWF- 319 


I I I I : : : I : I 


Db 


320 


TINGVQYPLPASAYVRQSQQSCTSGFQAMNLPTSSGDLWILGD 362 


Qy 397 VIMEGFYVVFDRARKRIGFA 416

I : : I I I I I I I : I 

Db 363 VFIREYYVVFDRANNYVAMA 382


RESULT 4 
B38302 

pepsin (EC 3.4.23.-) II-1 precursor - rabbit
C;Species: Oryctolagus cuniculus (domestic rabbit)
C;Date: 14-Jun-1991 #sequence_revision 20-Sep-1991 #text_change 23-Feb-1997
C;Accession: B38302
R;Kageyama, T.; Tanabe, K.; Koiwai, O.
J. Biol. Chem. 265, 17031-17038, 1990
A;Title: Structure and development of rabbit pepsinogens. Stage-specific
zymogens, nucleotide sequences of cDNAs, molecular evolution, and gene
expression during development.
A;Reference number: A38302; MUID:91009127; PMID:2129536
A;Accession: B38302
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-387 <KAG>
A;Cross-references: GB:M59235; GB:J05638
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; hydrolase; phosphoprotein; protein digestion

Query Match 11.6%; Score 309; DB 2; Length 387; 

Best Local Similarity 27.1%; Pred. No. 5.1e-17; 

Matches 98; Conservative 68; Mismatches 130; Indels 66; Gaps 15; 


Qy 

75 

YYVEMT VGS PPQTLNI LVDTGS SNFAVG AAPHPFLHRYYQRQLSSTYRDLRKGVYV 

1 : : : : 1 : II 1 : : M 1 1 1 1 1 : : III:: III:: : : : 
YFGTISIGTPPQEFTVIFDTGSSNLWVPSTYCSSLACFLHKRFNPDDSSTFQATSETLSI 

130 

Db 

75 

134 

Qy 

131 

PYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESD KFFINGSNWEGILGLAYAEI 

I | Mill: 1 : 1 : : : : : 1 : : : : 1 1 1 1 1 1 1 1 
T YGTGSMTGI LGYDTVKV GN1EDTNQIFGLSKTEPGITFLV — APFDGILGLAYPSI 

187 

Db 

135 

18 9 

Qy 

188 

ARPDDSLEP FFDSLVKQTHV- PNLFS LHLCGAGFPLNQS EVLASVGGSMI I GGI DHSLYT 

: I : |||:: : 1 : 1 1 1 :: 1 1 1 = = lilt 1 El 
SASDAT — PVFDNMWNEGLVSEDLFSVYLSSNG EKGSMVMFGGIDSSYYT 

246 

Db 

190 

237 

Qy 

247 

GS LWYT P I RREWY YEVI I VRVEI NGQDLKM — DCKEYNYDKSIVDSGTTNLRLPKKVFEA 
I I I : I : II:::: : 1 1 1 : : 1 : : : I I : I I : I 1 
GSLNWVPVSHEGYWQITMDSITINGETIACADSC QAWDTGTS LLAGPT S AI S K 

304 

Db 

238 

291 

Qy 

305 

AVKS I KAAS STEKFPDGFWLGEQLV-CWQAGTTPWNI FPVI S LYLMGEVTNQS FRI TI LP 

II:: 1 1 1 : : 1 : 1 : 1 1 1 

IQSYIGASKNL LGENIISCSAIDSLPDIVF TINN 

363 

Db 

292 

325 

Qy 

364 

QQYLRPVED-VATSQDDC YKFAISQSSTGT — VMGAVIMEGFYWFDRARKRI GFAV 

|| | : III : : 1 1 : : 1 1 : : : 1 1 1 1 1 : : 1 1 
VQYPLPASAYILKEDDDCLSGFDGMNLDTSYGELWILGDVFIRQYFTVFDRANNQVGLAA 

417 

Db 

326 

385 

Qy 

418 

SA 419 



: I 


Db 


386 AA 387 


RESULT 5 
JC7573 

pepsinogen C - African clawed frog
N;Alternate names: progastricsin
C;Species: Xenopus laevis (African clawed frog)
C;Date: 30-Jun-2001 #sequence_revision 30-Jun-2001 #text_change 03-Aug-2001
C;Accession: JC7573; PC7118
R;Ikuzawa, M.; Inokuchi, T.; Kobayashi, K.; Yasumasu, S.
J. Biochem. 129, 147-153, 2001
A;Title: Amphibian pepsinogens: Purification and characterization of Xenopus
pepsinogens, and molecular cloning of Xenopus and bullfrog pepsinogens.
A;Reference number: JC7573; MUID:21064922; PMID:11134969
A;Contents: Stomach
A;Accession: JC7573
A;Molecule type: mRNA
A;Residues: 1-383 <IKU>
A;Cross-references: DDBJ:AB045379
A;Accession: PC7118
A;Molecule type: protein
A;Residues: 17-68 <IK2>
C;Comment: This protein is a zymogen for gastric aspartic proteinase, with
pepsin-like activity.
C;Genetics:
A;Gene: PgC
C;Superfamily: pepsin
C;Keywords: stomach; zymogen

Query Match 11.5%; Score 307.5; DB 2; Length 383; 

Best Local Similarity 25.9%; Pred. No. 6.6e-17; 

Matches 112; Conservative 64; Mismatches 139; Indels 117; Gaps 19; 


Qy 

23 

QHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRGKSGQGYYVEMTVG 
::||: II | : : : | : | III |:::| 
ENGIKAPL VD PAT K Y YNQ YAT AY EPLSN YMDMS YYGEISIG 

82 

Db 

34 

74 

QY 

83 

SPPQTLNILVDTGSSNFAVGA APHPFLHRYYQRQLSSTYRDLRKGVYVPYTQ 

: 1 1 1 : 1 1 1 1 1 1 1 1 : II : 1 1 1 1 = : : 1 

TPPQNFLVLFDTGSSNLWVASTYCQSQACTNHPL FNPSQSSTYSSNQQQFSLQYGT 

134 

Db 

75 

130 

QY 

135 

GKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSN WEGI LGLAYAEI AR 

1 1 II 1 hi II : ::h: hi ::IMIIII II 
GSLTGILGYDTVTI QNVAISQQEFGLSETEP GTN FVYAQ F D G I LGLAY P S I A- 

189 

Db 

131 

182 

Qy 

190 

PDDSLEPFFDSLVKQTHVPN-LFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGS 

: : : 1 : : 1 : 1 1 II : 1 1 : 1 1 : 1 : 1 1 1 
-VGGATTVMQGMMQQNLLNQPIFGFYLSG QS SQNGGEVAFGGVDQNYYTGQ 

248 

Db 

183 

232 

Qy 

249 

LWYT P I RREWYYEVI I VRVEINGQD LKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAA 

: : : I 1 : I I : : : 1 1 1 1 1 1 : : | | | : | | : | I : I I : 
IYWTPVTSETYWQIGIQGFSINGQATGWCSQGC QAI VDTGT S LLTAPQS VFS S L 

305 

Db 

233 

286 

Qy 

306 

VKS I KAAS STEKFPDGFWLGEQLVCWQAGTTPWNI — FPVI SLYLMG EVTN 

: : I I 1 1 : : 1 II 1 II M 1 
IQSIGAQQDQN GQYWSCS— NIQNLPTISFTI SGVSFPLPPSAYVLQ 

354 

Db 

287 

332 


Qy 355 QS FRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYWFD 407 

II I I I I : I : : : | | : : I I : I 

Db 333 QSSGYCTIGIMPTYLPSQNGQPL WILGDVFLREYYSVYD 371 

Qy 408 RARKRI GFAVS A 419 

: : I I I : I 

Db 372 LGNNQVGFATAA 383 


RESULT 6 
S19682 

pepsin A (EC 3.4.23.1) 4 precursor - Japanese macaque
N;Alternate names: pepsinogen A isozyme 4
C;Species: Macaca fuscata (Japanese macaque)
C;Date: 22-Nov-1993 #sequence_revision 19-Oct-1995 #text_change 18-Jun-1999
C;Accession: S19682; S16065
R;Kageyama, T.; Tanabe, K.; Koiwai, O.
Eur. J. Biochem. 202, 205-215, 1991
A;Title: Development-dependent expression of isozymogens of monkey pepsinogens
and structural differences between them.
A;Reference number: S19681; MUID:92037645; PMID:1935977
A;Accession: S19682
A;Molecule type: mRNA
A;Residues: 1-388 <KAG>
A;Cross-references: EMBL:X59753; NID:g38070; PIDN:CAA42425.1; PID:g38071
A;Note: parts of sequence, including amino ends of pepsinogen and activation
intermediates, confirmed by protein sequencing
C;Comment: This is a minor component of pepsin at all post-partum stages.
C;Comment: Although two-step activation is observed, activation is
predominantly a one-step process.
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; gastric juice; hydrolase; phosphoprotein;
protein digestion; stomach
F;1-15/Domain: signal sequence #status predicted <SIG>
F;16-388/Product: pepsinogen A 4 #status experimental <PPT>
F;16-62/Domain: activation peptide #status experimental <APT>
F;63-388/Product: pepsin A 4 #status experimental <ENZ>
F;38-39/Cleavage site: Leu-Lys (pepsin) #status experimental
F;62-63/Cleavage site: Leu-Ile (pepsin) #status experimental
F;94,277/Active site: Asp #status predicted
F;107-112,268-272,311-344/Disulfide bonds: #status predicted
F;130/Binding site: phosphate (Ser) (covalent) #status predicted

Query Match 11.5%; Score 307.5; DB 1; Length 388; 

Best Local Similarity 27.6%; Pred. No. 6.8e-17; 

Matches 108; Conservative 65; Mismatches 135; Indels 83; Gaps 17; 

Qy 44 PRETDEEPEEPGRRGS FVEMVDNLRGKSGQGYYVEMTVGS PPQTLNI LVDTGS SNFAVGA 103 

| | | : | | : : : : : I : : : I : I I : : I I I I I I I 

Db 60 PTLIDEQPLE NYLDV EYFGTI GI GT PAQNFTWFDTGS SNLWV — 102 

Qy 104 APHPFL HRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTV 156 

I : I : I I I I I I I I : I I Mill: : : 

Db 103 -PSVYCYSLACMDHNLFNPQDSSTYRATSKTVSITYGTGSMTGILGYDTVKV GGISD 158 


Qy 157 RANIAAITESDK-FFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHV-PNLFSLH 214


I I I : : : : I I I I I I I I : III:: I I : I I I : : 

Db 159 TNQI FGLSETEPGFFLYFAPFDGI LGLAYPSI S — SS GAT PVFDN I WNQRLVSQDLFSVY 216 

Qy 215 LCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDL 274 

I : I I I : I I I I I I I I I I I : I : II:::: : : I I : : 

Db 217 LSAD DQS GSWI FGGI DS S YYTGS LNWVPVS VEGYWQ I S VDS I TMNGKT I 266 

Qy 275 — KMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLV-CW 331 

I : : I I I : I I : I I II:::: I I : I I 

Db 267 ACAKGC QAIVDTGTSLLTGPTSPIANIQSDIGASENSD GEMWSCS 312 

Qy 332 QAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQY-LRPVEDVATSQDDCYK FAI 385 

: I : I 111111:111 
Db 313 AISSLPDIVF TINGVQYPLPPSAYILQSQGSCTSGFQGMDVP 354 

Qy 386 SQSSTGTVMGAVIMEGFYVVFDRARKRIGFA 416

::| ::| I : :: I I I I I —II 

Db 355 TESGELWILGDVFIRQYFTVFDRANNQVGLA 385


RESULT 7 
A39314 

gastricsin (EC 3.4.23.3) precursor - bullfrog
C;Species: Rana catesbeiana (bullfrog)
C;Date: 19-Jun-1992 #sequence_revision 19-Jun-1992 #text_change 22-Jun-1999
C;Accession: A39314
R;Yakabe, E.; Tanji, M.; Ichinose, M.; Goto, S.; Miki, K.; Kurokawa, K.; Ito,
H.; Kageyama, T.; Takahashi, K.
J. Biol. Chem. 266, 22436-22443, 1991
A;Title: Purification, characterization, and amino acid sequences of pepsinogens
and pepsins from the esophageal mucosa of bullfrog (Rana catesbeiana).
A;Reference number: A39314; MUID:92042186; PMID:1939266
A;Accession: A39314
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-384 <YAK>
A;Cross-references: GB:M73750; NID:g213687; PIDN:AAA49530.1; PID:g213688
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; hydrolase; protein digestion

Query Match 11.4%; Score 305; DB 2; Length 384; 

Best Local Similarity 24.5%; Pred. No. 1.1e-16;

Matches 105; Conservative 66; Mismatches 146; Indels 112; Gaps 17; 

Qy 24 HGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRR — GSFVEMVDNLRGKSGQGYYVEMTV 81 

Ml: |: :| : :| : I || |::: 

Db 35 HGIKAPV ' VDPATKYYNNFATAFEPLANYMDMSYYGEI SI 73 


Qy 82 GSPPQTLNILVDTGSSNFAVGAAPHPFL HRYYQRQLS STYRDLRKGVYVP YTQ 134 

I : I I I : I I I I I I I I I : I : I I : I : : : I 

Db 74 GTPPQNFLVLFDTGSSNLWV PSTYCQSQACTNHPQFNPSQSSSYSSNQQQFSLQYGT 130 

Qy 135 GKWEGELGTDLVS I PHGPNVTVRANIA AITESDKFFINGSNWEGILGLAYAE 186 

I I I I I I I III : : I I I : : : : I M I I I I 

Db 131 GSLTGILGYDTVQI QNIAISQQEFGLSVTEPGTNFVy-AQFDGILGLAYPS 180 


Qy 187 IARPDDSLEPFFDSLVKQTHVPN-LFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLY 245


II: : : : | : I I : : I I I I I : I I : I : I 

Db 181 I A- - EGGATTVMQGMIQQNLINQPLFAFYLS GQQNSQN GGEVAFGGVDQNYY 230 

Qy 246 TGS LWYT P I RREWY YEVI I VRVEINGQD LKMDCKEYNYDKSIVDSGTTNLRLPKKVF 302 

: I : : : I I : I I : : : I : I I I I : I I I : I I : I I : I I 

Db 231 SGQIYWTPVTSETYWQIGIQGFSVNGQATGWCSQGC QGIVDTGTSLLTAPQSVF 284 

Qy 303 EAAVK S I KAAS STEKFPDG FW LG EQ L V- CWQ AGT T P WN I F P VI SLYLMGEVT 353 

: : : I I I I : I I : I I : | I : : : : 

Db 285 SSLMQSIGAQQDQN GQYAVSCSNIQSLPTISFTISGVSFPLPPSAYVLQQNS 336 

Qy 354 NQ SFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYWFDRAR 410 

I I I I : I : : : | I : : I I : I 

Db 337 GYCTIGIMPTYLPSQNGQPL WILGDVFLRQYYSVYDLGN 375 

Qy 411 KRIGFAVSA 419 

: : I I I : I 
Db 376 NQVGFAAAA 384 


RESULT 8 
PECH 

pepsin A (EC 3.4.23.1) precursor - chicken
N;Alternate names: pepsinogen A
C;Species: Gallus gallus (chicken)
C;Date: 18-Apr-1984 #sequence_revision 01-Dec-2000 #text_change 01-Dec-2000
C;Accession: JE0370; A00984
R;Sakamoto, N.; Saiga, H.; Yasugi, S.
Biochem. Biophys. Res. Commun. 250, 420-424, 1998
A;Title: Analysis of temporal expression pattern and cis-regulatory sequences of
chicken pepsinogen A and C.
A;Reference number: JE0370; MUID:98440813; PMID:9753645
A;Accession: JE0370
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-382 <SAK>
A;Cross-references: GB:AB025281; NID:g4589837; PIDN:BAA76891.1; PID:g4589838
R;Baudys, M.; Kostka, V.
Eur. J. Biochem. 136, 89-99, 1983
A;Title: Covalent structure of chicken pepsinogen.
A;Reference number: A00984; MUID:84004412; PMID:6617663
A;Accession: A00984
A;Molecule type: protein
A;Residues: 16-87,'S',89-382 <BAU>
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; gastric juice; glycoprotein; hydrolase; protein
digestion; stomach
F;16-57/Domain: activation peptide #status experimental <APT>
F;58-382/Product: pepsin A #status predicted <MAT>
F;92,275/Active site: Asp #status predicted
F;105-110,266-270,305-338/Disulfide bonds: #status experimental
F;128/Binding site: carbohydrate (Asn) (covalent) #status experimental


Query Match 11.4%; Score 304; DB 1; Length 382; 

Best Local Similarity 24.0%; Pred. No. 1.3e-16; 

Matches 88; Conservative 69; Mismatches 125; Indels 84; Gaps 13; 


Qy 75 Y YVEMT VG S P P QT LN I L VDT G S S N FAVGAAP H P F L HRYYQRQLS ST YRDLRKG 127 

II : : : I : I I : : I i I I I I I I : I : : I I I I = 

Db 74 YYGTI S I GTPQQDFTVI FDTGS SNLWV PSIYCKSSACSNHKRFDPSKSSTYVSTNET 130 

Qy 128 VYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDK-FFINGSNWEGILGLAYAE 186 

I I : I I I I I I I : : : : I : I : : I : : I I : : I II I I I : 

Db 131 VYIAYGTGSMSGILGYDTVAV SSIDVQNQIFGLSETEPGSFFYYCNFDGILGLAFPS 187 

Qy 187 IARPDDSLEPFFDSLVKQTHV-PNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLY 245 

I : | | | : : : | I : I I I : : I I I : : I I I I : 

Db 188 IS — SSGATPVFDNMMSQHLVAQDLFSVYLSKDG ETGS FVLFGGI DPNYT 235 

Qy 246 TGSLWYTPIRREWYYEVIIVRVEINGQDLK — MDCKEYNYDKSIVDSGTTNLRLPKKVFE 303 

I : : : I : I I : : : : I I : : : I : : I I I : I I : I : I : : 

Db 2 36 TKGIYWVPLSAETYWQITMDRVTVGNKYVACFFTC QAI VDT GT S L LVMP Q GAYN 289 

Qy 304 AAVKSIKAASSTE KFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQS 356 

: I : : I I III : : : : I 
Db 2 90 RIIKDLGVSSDGEISCDDISKLPD VTFHINGHA 322 

Qy 357 FRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGT VMGAVIMEGFYWFDRAR 410 

: I : : I II : : I I : : I I : I I I I 

Db 323 FTLPASAYVLNEDGSCMLGFENMGTPTELGEQWILGDVFIREYYVI FDRAN 373 

Qy 411 KRIGFA 416 

: : | : 

Db 374 NKVGLS 379 


RESULT 9 
A41443 

pepsin (EC 3.4.23.-) precursor, embryonic - chicken
C;Species: Gallus gallus (chicken)
C;Date: 05-Jun-1992 #sequence_revision 05-Jun-1992 #text_change 21-Jul-2000
C;Accession: A41443
R;Hayashi, K.; Agata, K.; Mochii, M.; Yasugi, S.; Eguchi, G.; Mizuno, T.
J. Biochem. 103, 290-296, 1988
A;Title: Molecular cloning and the nucleotide sequence of cDNA for embryonic
chicken pepsinogen: phylogenetic relationship with prochymosin.
A;Reference number: A41443; MUID:88227903; PMID:3131317
A;Accession: A41443
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-383 <HAY>
A;Cross-references: GB:D00215; NID:g2760810; PIDN:BAA00153.1; PID:g222853
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; hydrolase; protein digestion

Query Match 11.3%; Score 301.5; DB 2; Length 383; 

Best Local Similarity 25.2%; Pred. No. 2e-16; 

Matches 90; Conservative 76; Mismatches 124; Indels 67; Gaps 14; 

Qy 75 Y YVEMT VG S P P QT LN I LVDT G S S N FAVGA APHPFLHRYYQRQLSSTYRDLRKGVYV 130 

II : : : I : I I I : : II I M I I : : I I : : MM: : : : 

Db 76 YYGTISIGTPPQDFTWFDTGSSNLWVPSVSCTSPACQSHQMFNPSQSSTYKSTGQNLSI 135 


Qy 131 PYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARP 190


I I I I : I I I : : : : : : I : I I : : : I I I I I I : I 

Db 136 HYGTGDMEGTVGCDTVTVASLMDTNQLFGLST-SEPGQFFVY-VKFDGILGLGYPSLAA- 192 

Qy 191 DDSLEPFFDSLVKQTHV-PNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSL 24 9 

I : I I I : : I : : : I I I I : : I : I : : I I I I I : I I I : 

Db 193 -DGITPVFDNMVNESLLEQNLFSVYLS REPMGSMWFGGIDESYFTGSI 24 0 

Qy 250 WYTPIRREWYYEVIIVRVEINGQDL — KMDCKEYN YDKS I VDSGTTNLRLPKKVFEAAVK 307 

: I : : | : : : : : : I I : : I : : |. : I : I I : : I 

Db 241 NWIPVSYQGYWQISMDSIIVNKQEIACSSGC QAIIDTGTSLVAGPASDINDIQS 294 

Qy 308 SIKAASSTEKFPDGFWLGEQLVCVJQAGTTPWNIFPVISL YLMGEVTNQSFRITILP 363 

: : I : I III I : : : : : : : | : 
Db 2 95 AVGANQNT YGEYSV NCSHILAMPDWFVIGGI 32 6 

Qy 364 QQYLRPVEDVA TSQDDCYKFAISQSSTGTVMGAVIMEGFYWFDRARKRIGFA 416 

II || : I I I : I : : : I I : : I : I I I I I : I I 

Db 327 -QY — PVPALAYTEQNGQGTCMS S FQNS SADLWI LGDVFI RVYYS I FDRANNRVGLA 380 


RESULT 10 
A34401 

cathepsin E (EC 3.4.23.34) precursor - human
C;Species: Homo sapiens (man)
C;Date: 22-Jun-1990 #sequence_revision 22-Jun-1990 #text_change 22-Jun-1999
C;Accession: A42038; A34401; S35663; S34467; A34643; B34643
R;Azuma, T.; Liu, W.; Vander Laan, D.J.; Bowcock, A.M.; Taggart, R.T.
J. Biol. Chem. 267, 1609-1614, 1992
A;Title: Human gastric cathepsin E gene. Multiple transcripts result from
alternative polyadenylation of the primary transcripts of a single gene locus at
1q31-q32.
A;Reference number: A42038; MUID:92112877; PMID:1370478
A;Accession: A42038
A;Molecule type: DNA
A;Residues: 1-396 <AZU>
A;Cross-references: GB:M84424; GB:M82847; NID:g181203; PIDN:AAA52300.1;
PID:g181205
A;Note: sequence extracted from NCBI backbone (NCBIN:75963, NCBIN:75966,
NCBIN:75971, NCBIN:75974, NCBIN:75977, NCBIN:75979, NCBIN:75981, NCBIN:75988,
NCBIN:75990, NCBIP:75991)
R;Azuma, T.; Pals, G.; Mohandas, T.K.; Couvreur, J.M.; Taggart, R.T.
J. Biol. Chem. 264, 16748-16753, 1989
A;Title: Human gastric cathepsin E. Predicted sequence localization to
chromosome 1, and sequence homology with other aspartic proteinases.
A;Reference number: A34401; MUID:89380302; PMID:2674141
A;Accession: A34401
A;Molecule type: mRNA
A;Residues: 1-396 <AZ2>
A;Cross-references: GB:J05036; NID:g181193; PIDN:AAA52130.1; PID:g181194
R;Takeda-Ezaki, M.; Yamamoto, K.
Arch. Biochem. Biophys. 304, 352-358, 1993
A;Title: Isolation and biochemical characterization of procathepsin E from human
erythrocyte membranes.
A;Reference number: S35663; MUID:93349047; PMID:8346912
A;Accession: S35663
A;Status: preliminary
A;Molecule type: protein


A;Residues: 20-38;54-76 <TAK>
R;Hill, J.; Montgomery, D.S.; Kay, J.
FEBS Lett. 326, 101-104, 1993
A;Title: Human cathepsin E produced in E. coli.
A;Reference number: S34467; MUID:93314762; PMID:8325357
A;Accession: S34467
A;Status: preliminary
A;Molecule type: protein
A;Residues: 57-60,62-81 <HIL>
R;Athauda, S.B.P.; Matsuzaki, O.; Kageyama, T.; Takahashi, K.
Biochem. Biophys. Res. Commun. 168, 878-885, 1990
A;Title: Structural evidence for two isozymic forms and the carbohydrate
attachment site of human gastric cathepsin E.
A;Reference number: A34643; MUID:90241267; PMID:2334440
A;Accession: A34643
A;Status: preliminary
A;Molecule type: protein
A;Residues: 54-58, 'XXX 1 , 62-64, 'NT , 66-89, 'X 1 , 91-95 <ATH>
A;Accession: B34643
A;Status: preliminary
A;Molecule type: protein
A;Residues: 54-59, 'X 61-68 <AT2>
C;Genetics:
A;Gene: GDB:CTSE
A;Cross-references: GDB:119821; OMIM:116890
A;Map position: 1q31-1q31
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; blocked amino end; hydrolase; zymogen
F;1-17/Domain: signal sequence #status predicted <SIG>
F;18-53/Domain: activation peptide #status predicted <PRO>
F;54-396/Product: cathepsin E #status predicted <MAT>
F;18/Modified site: blocked amino end (Gln) (in mature form) (probably
pyrrolidone carboxylic acid) #status experimental
F;96,281/Active site: Asp #status predicted

Query Match 11.3%; Score 301.5; DB 2; Length 396; 

Best Local Similarity 25.8%; Pred. No. 2.1e-16; 

Matches 100; Conservative 68; Mismatches 144; Indels 75; Gaps 16; 

EPEEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGA 103 

: I I : : I I : : : : I I I I I : : I I I I I I I : 

SAKEP LINYLD MEYFGTI S I GS PPQNFTVI FDTGS SNLWVP SVYCT 110 


Qy 

48 

Db 

63 

Qy 

104 

Db 

111 

Qy 

164 

Db 

170 

QY 

223 

Db 

220 

Qy 

283 


I I I I : : I I I : I I I I : I I 


Mi l : I : | | | : : : | | : I I : : : 

:lglgypsla — VGGVTPVFDNMMAQNLVDLPMFSVYM 219 


I I : I I I I I I : : I I I : I : : : I : : : : : : : I III 

-SSNPEGGAGSELIFGGYDHSHFSGSLNWVPVTKQAYWQIALDNIQVGG — TVMFCSE — 274 


111:11:: I : : I I I II 


Db 275 G CQAI VDT GTSLITGPSDKI KQ LQN AI GAAP VDGEYAVE CANLNVMP 321 


Qy 


343 VISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTG 


391 


Db 


322 DVT FT I NG VPYTLSPTAY — TLLDFVDGMQFC 


SSGFQGLDIHPPAG 365 


Qy 


Db 


392 — TVMGAVIMEGFYWFDRARKRIGFA 416 

: : I I : I I I I I I I : I I 
366 PLWI LGDVFI RQFYSVFDRGNNRVGLA 392 


RESULT 11 
KHHUD 

cathepsin D (EC 3.4.23.5) precursor [validated] - human
N;Alternate names: preprocathepsin D
C;Species: Homo sapiens (man)
C;Date: 28-Dec-1987 #sequence_revision 28-Dec-1987 #text_change 15-Sep-2000
C;Accession: A25771; S30749; PC2066; I59236; I57716
R;Faust, P.L.; Kornfeld, S.; Chirgwin, J.M.
Proc. Natl. Acad. Sci. U.S.A. 82, 4910-4914, 1985
A;Title: Cloning and sequence analysis of cDNA for human cathepsin D.
A;Reference number: A25771; MUID:85270436; PMID:3927292
A;Accession: A25771
A;Molecule type: mRNA
A;Residues: 1-412 <FAU>
A;Cross-references: EMBL:M11233; NID:g181179; PIDN:AAB59529.1; PID:g181180
R;Westley, B.R.; May, F.E.B.
Nucleic Acids Res. 15, 3773-3786, 1987
A;Title: Oestrogen regulates cathepsin D mRNA levels in oestrogen responsive
human breast cancer cells.
A;Reference number: S30749; MUID:87231068; PMID:3588310
A;Accession: S30749
A;Molecule type: mRNA
A;Residues: 1-412 <WES>
A;Cross-references: EMBL:X05344; NID:g29677; PIDN:CAA28955.1; PID:g29678
R;May, F.E.B.; Smith, D.J.; Westley, B.R.
Gene 134, 277-282, 1993
A;Title: The human cathepsin D-encoding gene is transcribed from an estrogen-
regulated and a constitutive start point.
A;Reference number: PC2066; MUID:94085791; PMID:8262386
A;Accession: PC2066
A;Molecule type: DNA
A;Residues: 1-23 <MAY>
A;Cross-references: GB:L12980; NID:g291930; PIDN:AAA16314.1; PID:g455429
A;Experimental source: MCF-7 cell
R;Cavailles, V.; Augereau, P.; Rochefort, H.
Proc. Natl. Acad. Sci. U.S.A. 90, 203-207, 1993
A;Title: Cathepsin D gene is controlled by a mixed promoter, and estrogens
stimulate only TATA-dependent transcription in breast cancer cells.
A;Reference number: I59236; MUID:93126342; PMID:8419924
A;Accession: I59236
A;Status: translation not shown; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-22 <CAV1>
A;Cross-references: GB:S52557; NID:g263124; PIDN:AAD13868.1; PID:g4261568
R;Augereau, P.; Miralles, F.; Cavailles, V.; Gaudelet, C.; Parker, M.;
Rochefort, H.
Mol. Endocrinol. 8, 693-703, 1994
A;Title: Characterization of the proximal estrogen-responsive element of human
cathepsin D gene.
A;Reference number: I57716; MUID:95021301; PMID:7935485
A;Accession: I57716
A;Status: translation not shown; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-22 <CAV2>
A;Cross-references: GB:S74689; NID:g786350; PIDN:AAD14156.1; PID:g4261856
R;Baldwin, E.T.; Bhat, T.N.; Gulnik, S.; Erickson, J.W.
submitted to the Brookhaven Protein Data Bank, April 1993
A;Reference number: A51839; PDB:1LYA
A;Contents: annotation; X-ray crystallography, 2.5 angstroms, residues 65-
161;170-241
R;Baldwin, E.T.; Bhat, T.N.; Gulnik, S.; Erickson, J.W.
submitted to the Brookhaven Protein Data Bank, April 1993
A;Reference number: A51840; PDB:1LYB
A;Contents: annotation; X-ray crystallography, 2.5 angstroms, with inhibitor
residues 65-161;170-241
R;Baldwin, E.T.; Bhat, T.N.; Gulnik, S.; Hosur, M.V.; Sowder II, R.C.; Cachau,
R.E.; Collins, J.; Silva, A.M.; Erickson, J.W.
Proc. Natl. Acad. Sci. U.S.A. 90, 6796-6800, 1993
A;Title: Crystal structures of native and inhibited forms of human cathepsin D:
implications for lysosomal targeting and drug design.
A;Reference number: A48229; MUID:93342076; PMID:8393577
A;Contents: annotation; X-ray crystallography, 2.5 angstroms
C;Comment: Cathepsin D is a ubiquitous lysosomal proteinase.
C;Comment: In addition to the propeptide, residues 163-168 and 411-412 are
proteolytically removed. Residues 169 and 170 are also partially removed.
C;Comment: The carbohydrate bound to 134-Asn contains a mannose-6-phosphate that
is bound near 267-Lys and the phosphotransferase recognition region.
C;Genetics:
A;Gene: GDB:CTSD
A;Cross-references: GDB:120512; OMIM:116840
A;Map position: 11p15.5-11p15.5
C;Function:
A;Description: limited specificity endopeptidase
A;Pathway: intracellular protein degradation
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; glycoprotein; hydrolase; lysosome; protein
degradation
F;1-20/Domain: signal sequence #status predicted <SIG>
F;21-64/Domain: propeptide #status predicted <PRO>
F;65-162,169-410/Product: cathepsin D #status experimental <MAT>
F;267,329-356/Region: phosphotransferase recognition
F;91-160,110-117,286-290,329-366/Disulfide bonds: #status experimental
F;97,295/Active site: Asp #status experimental
F;134,263/Binding site: carbohydrate (Asn) (covalent) #status experimental

Query Match 11.3%; Score 300.5; DB 1; Length 412; 

Best Local Similarity 26.9%; Pred. No. 2.7e-16; 

Matches 123; Conservative 68; Mismatches 170; Indels 97; Gaps 21; 

Qy 5 LPWLLLWMGAGVLPAHGTQHGIRLPLR SGLGGAPLGL RLP 44 

111:1 II : I : I I I : I I : I : I 

Db 7 LPLALCLLAA PASAL VRI PLHKFT S I RRTMS EVGGS VEDLI AKGPVS KYSQAVP 60 


Qy 45 RETDEEPEEP GRRG S FVEMVDN L RGK S GQ G Y YVEMT VG S P P QT LN I L VDT G S S N FAVGAA 104 

I : I I : : I I I I : : I : I I I :: I I I I I I I : 

Db 61 AVTE GPIPEVLKNYMDAQ YYGEIGIGTPPQCFTWFDTGSSNLWVPSI 108 

Qy 105 PHPFL HRYYQRQLS ST YRDLRKGVYVP YTQQKWEGELGTDLVS I P 149 

I II I I I I : I I I I I I I : I 

Db 109 HCKLLDIACWIHHKYNSDKSSTYVKNGTSFDIHYGSGSLSGYLSQDTVSVPCQSASSASA 168 

Qy 150 HGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHV-P 208 

I I I : : : : I I I I : I I I : : : : I I I : I : : I I 

Db 169 LGGVKVERQVFGEATKQPGITFIAAKFDGILGMAYPRIS--VNNVLPVFDNLMQQKLVDQ 226

Qy 209 NLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVE 268 

I : I I : I : I I I : : : I I I I I I I I : I : I : : I : : I I 

Db 227 NIFSFYL S RD P DAQ P GGELMLGGT D S KY YKG S L S YLNVT RKAYWQVHLDQVE 278 

Qy 269 I-NGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSI KAASSTEKFPDGFWLGEQ 327 

: : I I III : : | | | : M : : I I : I I : II 
Db 279 VASGLTL CKE — GCEAIVDTGTSLMVGPVDEVRELQKAIGAVPLIQ GEY 325 

Qy 328 LV-CWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAIS 386 

: : I : I I I : I I I : : : : : I : I I : I : 

Db 326 MIPCEKVST LPAITLKLGG KGYKLS — PEDYTLKVSQAGKTL — CLSGFMG 372 

Qy 387 Q S STGTVMGAVIMEGFYWFDRARKRI GFAVSA 419 

I : : I I : : I I I I I I : I I I : I 

Db 373 MDIPPPSGPLWILGDVFIGRYYTVFDRDNNRVGFAEAA 410 


RESULT 12 
C38302 

pepsin (EC 3.4.23.-) II-2/3 precursor - rabbit 

C;Species: Oryctolagus cuniculus (domestic rabbit)
C;Date: 14-Jun-1991 #sequence_revision 14-Jun-1991 #text_change 23-Feb-1997
C;Accession: C38302
R;Kageyama, T.; Tanabe, K.; Koiwai, O.
J. Biol. Chem. 265, 17031-17038, 1990
A;Title: Structure and development of rabbit pepsinogens. Stage-specific
zymogens, nucleotide sequences of cDNAs, molecular evolution, and gene
expression during development.
A;Reference number: A38302; MUID:91009127; PMID:2129536
A;Accession: C38302
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-387 <KAG>
A;Cross-references: GB:J05638
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; hydrolase; phosphoprotein; protein digestion

Query Match 11.2%; Score 299; DB 2; Length 387; 

Best Local Similarity 26.9%; Pred. No. 3.3e-16; 

Matches 97; Conservative 64; Mismatches 134; Indels 66; Gaps 13; 

Qy 75 Y YVEMT VG S P P QT LN I LVDT G S S N FAVGAAP H P F LHRYYQRQLS STYRDLRKG 127 

I : : : : I : I I I : : I I I I I I I I : I I : : : I I I I : : 

Db 75 YFGTISIGTPPQDFTVIFDTGSSNLWV PSTYCSSLACALHKRFNPEDSSTYQGTSET 131 


Qy 128 VWPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEI 187 

: : I I Mill: : : : II : : : I I I I I I I I 

Db 132 LSITYGTGSMTGILGYDTVKVGSIEDTNQIFGLSKTEPSLTFLF — APFDGILGLAYPSI 189 

Qy 188 ARPDDSLEPFFDSLVKQTHV-PNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYT 246

: | : Ml:: : I : I I I : : I I : : I I I I' I I I 

Db 190 SSSDAT--PVFDNMWNEGLVSQDLFSVYLSSDD EKGSLVMFGGIDSSYYT 237 

Qy 247 G S LW YT P I RREW Y YEVI I VRVE I NGQ DLKM — DCKEYNYDKSIVDSGTTNLRLPKKVFEA 304 

I I I : I : II:::: I I I I : : I : : M I : I I : I I : 
Db 238 G S LNWVP VS YEG YWQ I TMD S VS I N GET I ACAD S C QAIVDTGTSLLTGP TS 287 

Qy 305 AVKSIKAASSTEKFPDGFWLGEQLV-CWQAGTTPWNIFPVISLYLMGEVTNQSFRITILP 363 

I : : I : : I I I I : : I : I : I II 

Db 288 AISNIQSYIGASK NLLGENVISCSAIDSLPDIVF TING 325 

Qy 364 QQYLRPVEDVATSQDDCYKFAISQSSTGT VMGAVIMEGFYWFDRARKRIGFAV 417 

Ml : I I : : I : : I I : : : I I I I I : : I I 

Db 326 IQYPLPASAYILKEDDDCTSGLEGMNVDTYTGELWILGDVFIRQYFTVFDRANNQLGLAA 385 

Qy 418 S 418 

Db 386 A 386 


RESULT 13 
D38302 

pepsin (EC 3.4.23.-) II-4 precursor - rabbit 

C;Species: Oryctolagus cuniculus (domestic rabbit)
C;Date: 14-Jun-1991 #sequence_revision 20-Sep-1991 #text_change 23-Feb-1997
C;Accession: D38302
R;Kageyama, T.; Tanabe, K.; Koiwai, O.
J. Biol. Chem. 265, 17031-17038, 1990
A;Title: Structure and development of rabbit pepsinogens. Stage-specific
zymogens, nucleotide sequences of cDNAs, molecular evolution, and gene
expression during development.
A;Reference number: A38302; MUID:91009127; PMID:2129536
A;Accession: D38302
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-387 <KAG>
A;Cross-references: GB:M59235; GB:J05638
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; hydrolase; phosphoprotein; protein digestion

Query Match 11.2%; Score 298; DB 2; Length 387; 

Best Local Similarity 26.1%; Pred. No. 3.9e-16; 

Matches 97; Conservative 66; Mismatches 122; Indels 86; Gaps 14; 

Qy 75 Y YVEMT VG S P P QT LN I L VDT G S S N FAVGAAP H P F LHRYYQRQLS ST YRDLRKG 127 

I : : : : I : I I I : : I I I M I I I : II:: : I I I I : : 

Db 75 YFGTISIGTPPQDFTVIFDTGSSNLWV PSTYCSSLACALHKRFNPEDSSTYQGTSET 131 

Qy 128 VYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFF INGSNWE 177 

: : I I I I I I I : :| :::: | : :: 

Db 132 LSITYGTGSMTGILGYDTV KVGSIEDTNQIFGLSKTEPGLTFLFAPFD 179 


Qy 178 GI LGLAYAEIARPDDSLEPFFDS LVKQTHV- PNLFS LHLCGAGFPLNQS EVLASVGGSMI 236 

I I I I I I I I : I : III:: : I : I I I : : I I : : 

Db 180 GILGLAYPSISSSDAT — PVFDNMWNEGLVSQDLFSVYLSSDD EKGSLVM 227 

Qy 237 I GGI DHSLYTGSLWYTPI RREWYYEVI I VRVEINGQDLKM — DCKEYNYDKS I VDSGTTN 294 

I I I I I I I I I I : I : II:::: MM:: | : : I I I : I I : 

Db 228 FGGIDSSYYTGSLNWVPVSYEGYWQITMDSVSINGETIACADSC QAIVDTGTSL 281 

Qy 295 LRLPKKVFEAAVKS I KAAS STEKFPDGFWLGEQLV- CWQAGTT PWN I FPVI S L YLMGEVT 353 

|| :|: :|:: I III :: I M :| 
Db 282 LTGP TSAISNIQSYIGASK NLLGENVISCSAIDSLPDIVF 321 

Qy 354 NQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGT VMGAVI ME G F YWFD 407 

|| M I : I I : : I : : | | : : : | I I 

Db 322 TINGIQYPLPASAYILKEDDDCTSGLEGMNVDTYTGELWILGDVFIRQYFTVFD 375 

Qy 408 RARKRIGFAVS 418 

II : : I I : 

Db 376 RANNQLGLAAA 386 


RESULT 14 
KHRTD 

cathepsin D (EC 3.4.23.5) precursor - rat 
C;Species: Rattus norvegicus (Norway rat)
C;Date: 31-Dec-1991 #sequence_revision 31-Dec-1991 #text_change 18-Jun-1999
C;Accession: S13111; C31918; JQ1177; PQ0222
R;Birch, N.P.; Loh, Y.P.
Nucleic Acids Res. 18, 6445-6446, 1990
A;Title: Cloning, sequence and expression of rat cathepsin D.
A;Reference number: S13111; MUID:91057150; PMID:2243802
A;Accession: S13111
A;Molecule type: mRNA
A;Residues: 1-407 <BIR>
A;Cross-references: EMBL:X54467; NID:g55881; PIDN:CAA38349.1; PID:g55882
R;Yonezawa, S.; Takahashi, T.; Wang, X.; Wong, R.N.S.; Hartsuck, J.A.; Tang, J.
J. Biol. Chem. 263, 16504-16511, 1988
A;Title: Structures at the proteolytic processing region of cathepsin D.
A;Reference number: A92681; MUID:89034127; PMID:3182800
A;Accession: C31918
A;Molecule type: protein
A;Residues: 134-162,'T',164-170 <YON>
R;Fujita, H.; Tanaka, Y.; Noguchi, Y.; Kono, A.; Himeno, M.; Kato, K.
Biochem. Biophys. Res. Commun. 179, 190-196, 1991
A;Title: Isolation and sequencing of a cDNA clone encoding rat liver lysosomal
cathepsin D and the structure of three forms of mature enzymes.
A;Reference number: JQ1177; MUID:91354249; PMID:1883350
A;Accession: JQ1177
A;Molecule type: mRNA
A;Residues: 1-14,'A',16-204,'N',206-261,'N',263-407 <FUJ>
A;Accession: PQ0222
A;Molecule type: protein
A;Residues: 65-74;118-127;165-174 <FU2>
A;Experimental source: liver
C;Comment: Cathepsin D in rat liver lysosome occurs as a mixture of both a
single chain form and two types of two chain forms.
C;Function:
A;Description: limited specificity endopeptidase
A;Pathway: intracellular protein degradation
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; glycoprotein; hydrolase; lysosome; protein
degradation
F;1-20/Domain: signal sequence #status predicted <SIG>
F;21-64/Domain: propeptide #status predicted <PRO>
F;65-407/Product: cathepsin D, 43K single-chain form #status predicted <MAT>
F;65-164/Product: (or 65-165) cathepsin D 12K light chain #status predicted <MA2>
F;65-117/Product: cathepsin D 9K light chain #status predicted <MA4>
F;118-407/Product: cathepsin D 34K heavy chain #status predicted <MA5>
F;165-407/Product: (or 166-407) cathepsin D 30K heavy chain #status predicted <MA3>
F;91-160,110-117,281-285,324-361/Disulfide bonds: #status predicted
F;97,290/Active site: Asp #status predicted
F;134,258/Binding site: carbohydrate (Asn) (covalent) #status predicted

Query Match 11.1%; Score 297; DB 1; Length 407; 

Best Local Similarity 26.1%; Pred. No. 5.1e-16; 

Matches 118; Conservative 76; Mismatches 170; Indels 88; Gaps 20; 

Qy 6 PWLLLWMGAGVLPAHGTQHGIRLPLR SGLGGA — PLGLRLPRETDEEPEEP 54 

I : i I : I : I I : I I : I I I : : I I : I I : I I 

Db 4 PGVLLLI -LGLLDAS SSAL- 1 RI PLRKFTS I RRTMTEVGGSVEDLI LKGPITKYSMQS S P 61 

Qy 55 GRRGS FVEMVDNLRGKS GQGYYVEMTVGS P PQTLN I LVDTGS SN FAVGAAPHP FL 109 

: I : : I I I I : : I : I I I : : I I I I I I I : I 

Db 62 RTKEPVSELLKNYLDAQ YYGEIGIGTPPQCFTWFDTGSSNLWVPSIHCKLLDIACW 118 

Qy 110 -HRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDK 168 

II I I I I : I I I I I I I : I : : : : | : 

Db 119 VHHKYNSDKSSTYVKNGTSFDIHYGSGSLSGYLSQDTVSVP CKSDLGGIKVEKQ 172 

Qy 169 FF INGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHV- PNLFSLHLCG 217 

I : : : | | | | : | I : : : I I I : j : | I I : I I : I 

Db 173 I FGEATKQPGWFIAAKFDGI LGMGYPFI S — VNKVLPVFDNLMKQKLVEKNIFSFYL — 228 

Qy 218 AGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMD 277 

: I I : : : I I I I I I I : | : | : : | : : : | : | : | : 

Db 229 NRDPTGQPGGELMLGGTDSRYYHGELSYLNVTRKAYWQVHMDQLEV-GSELTL- 280 

Qy 278 CKEYNYDKSIVDSGTTNLRLPKKVFEAAVKS1KAASSTEKFPDGFWLGEQLV-CWQAGTT 336 

II : : I I I : I I : I I : hi I : | | : : | : : 

Db 281 CK — GGCEAIVDTGTSLLVGPVDEVKELQKAIGAVPLIQ GEYMIPCEKVSS- 329 

Qy 337 PWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAIS Q 387 

I : I : I I : : I : : I : I : : I 

Db 330 LPIITFKLGGQ NYELHPEKYI LKVSQAGKT ICLSGFMGMDI PPP 373 

Qy 388 SSTGTVMGAVIMEGFYVVFDRARKRIGFAVSA 419

I : : I I : : I I I I I I : I I I : I 

Db 374 SGPLWILGDVFIGCYYTVFDREYNRVGFAKAA 405 


RESULT 15 
A43356 


cathepsin E (EC 3.4.23.34) precursor - guinea pig 

N;Alternate names: erythrocyte membrane aspartic proteinase; slow-moving 
proteinase 

C;Species: Cavia porcellus (guinea pig)
C;Date: 31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change 22-Jun-1999
C;Accession: A43356
R;Kageyama, T.; Ichinose, M.; Tsukada, S.; Miki, K.; Kurokawa, K.; Koiwai, O.;
Tanji, M.; Yakabe, E.; Athauda, S.B.; Takahashi, K.
J. Biol. Chem. 267, 16450-16459, 1992
A;Title: Gastric procathepsin E and progastricsin from guinea pig. Purification,
molecular cloning of cDNAs, and characterization of enzymatic properties, with
special reference to procathepsin E.
A;Reference number: A43356; MUID:92355614; PMID:1644829
A;Accession: A43356
A;Molecule type: mRNA
A;Residues: 1-391 <KAG>
A;Cross-references: GB:M88653; NID:g191294; PIDN:AAA37052.1; PID:g191295
A;Note: sequence extracted from NCBI backbone (NCBIN:110763, NCBIP:110769)
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; hydrolase; membrane protein

Query Match 11.1%; Score 295; DB 2; Length 391; 

Best Local Similarity 26.9%; Pred. No. 6.9e-16; 

Matches 98; Conservative 64; Mismatches 130; Indels 72; Gaps 16; 

Qy 75 Y YVEMT VG S P P QT LN I LVDT G S S N FAVGA APHPFLHRYYQRQLSSTYRDLRKGVYV 130 

I : : : : I I I I I :: I I I I I I I : : I I : I I I I I I : : : 

Db 74 YFGTISIGSPPQNFTVI FDTGSSNLWVPSVYCTSPACQTHPVFHPSLSSTYREVGNSFSI 133 

Qy 131 PYTQGKWEGELGTDLVSIPHGPNVTVRTVMIAAITESDKFFINGSNWEGILGLAYAEIARP 190 

I I I : I I I I : I I : : : I | | : : : : : I || I | I : I 

Db 134 QYGTGSLTGIIGADQVSV-EGLTWGQQFGESVQEPGKTFVH-AEFDGILGLGYPSLAA- 190 

Qy 191 DDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLW 250 

: I I I : : : I I I : I : I : I I : I I I I : : I I I 

Db 191 - GGVT P VFDNMMAQ NLVALPM FSVYMSSNPGGSGSELTFGGYDPSHFSGSLN 241 

Qy 251 YT P I RREWY YEVI I VRVEI NGQDLKMDCKEYN YDKS I VDS GTTNLRLPKKVFEAAVKS I K 310 

: I : : : I : : : : : : : I III : : I I I : I I : : I : I : : 

Db 242 WVPVTKQAYWQIALDGIQVG — DSVMFCSE — GCQAIVDTGTSLITGP PGKIKQLQ 2 93 

Qy 311 AASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRI TILPQQ 365 

I : I : : I : I : I : I I II I : I 

Db 294 EALGATYVDEGY SVQC AN LNMML DVT FI INGVPYTLNPTA 333 

Qy 366 YLRPVEDVATSQDDCYKFAISQSSTG TVMGAVIMEGFYWFDRARKR 412 

I : I I III :: I I : I I I I I I I 

Db 334 Y--TLLDFVDGMQVC STGFEGLEIQPPAGPLWILGDVFIRQFYAVFDRGNNR 383 

Qy 413 IGFA 416 

: I I 

Db 384 VGLA 387 


Search completed: January 21, 2004, 09:26:07 
Job time : 47.9809 secs
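
The summary lines printed for each result in the listing above can be cross-checked with simple arithmetic. The sketch below assumes, based only on the numbers in this report and not on GenCore documentation, that Query Match is the score divided by the perfect score, and that Best Local Similarity is the number of matches divided by the total number of aligned columns (matches + conservative + mismatches + indels). The function names are illustrative.

```python
def query_match(score, perfect_score):
    """Percent of the perfect (self) score reached by a hit, to one decimal."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of aligned columns that are exact matches, to one decimal."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


PERFECT_SCORE = 2664  # from the report header

# RESULT 11: Score 300.5; Matches 123; Conservative 68; Mismatches 170; Indels 97
print(query_match(300.5, PERFECT_SCORE))        # 11.3
print(best_local_similarity(123, 68, 170, 97))  # 26.9

# RESULT 13: Score 298; Matches 97; Conservative 66; Mismatches 122; Indels 86
print(query_match(298, PERFECT_SCORE))          # 11.2
print(best_local_similarity(97, 66, 122, 86))   # 26.1
```

Under these assumptions the computed values agree with the percentages printed for results 11 through 15.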


GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 


OM protein - protein search, using sw model

Run on:         January 21, 2004, 09:25:15 ; Search time 100.583 Seconds
                                            (without alignments)
                                            1018.511 Million cell updates/sec

Title:          US-09-869-414A-4
Perfect score:  2664
Sequence:       1 MAQALPWLLLWMGAGVLPAH...CLRCLRQQHDDFADDISLLK 501

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       762491 seqs, 204481190 residues

Total number of hits satisfying chosen parameters:      762491

Minimum DB seq length: 0
Maximum DB seq length: 2000000000


Post-processing : 


Minimum Match 0% 
Maximum Match 100% 
Listing first 45 summaries 
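
The throughput figure in the header can be reproduced from the other header values. The sketch below assumes that a "cell update" is one cell of the Smith-Waterman dynamic-programming matrix, so the total is the query length (501, from the Sequence line) times the number of database residues searched. That reading is an inference from the numbers in this report, and the function name is illustrative.

```python
def million_cell_updates_per_sec(query_len, db_residues, search_seconds):
    """Dynamic-programming cells filled per second, in millions."""
    cells = query_len * db_residues
    return cells / search_seconds / 1e6


rate = million_cell_updates_per_sec(
    query_len=501,            # length of US-09-869-414A-4
    db_residues=204_481_190,  # "Searched: 762491 seqs, 204481190 residues"
    search_seconds=100.583,   # "Search time 100.583 Seconds"
)
print(f"{rate:.1f}")  # 1018.5
```

The result agrees with the reported 1018.511 Million cell updates/sec to within the rounding of the search time.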


Database :      Published_Applications_AA:*
                1:  /cgn2_6/ptodata/2/pubpaa/US07_PUBCOMB.pep:*
                2:  /cgn2_6/ptodata/2/pubpaa/PCT_NEW_PUB.pep:*
                3:  /cgn2_6/ptodata/2/pubpaa/US06_NEW_PUB.pep:*
                4:  /cgn2_6/ptodata/2/pubpaa/US06_PUBCOMB.pep:*
                5:  /cgn2_6/ptodata/2/pubpaa/US07_NEW_PUB.pep:*
                6:  /cgn2_6/ptodata/2/pubpaa/PCTUS_PUBCOMB.pep:*
                7:  /cgn2_6/ptodata/2/pubpaa/US08_NEW_PUB.pep:*
                8:  /cgn2_6/ptodata/2/pubpaa/US08_PUBCOMB.pep:*
                9:  /cgn2_6/ptodata/2/pubpaa/US09A_PUBCOMB.pep:*
                10: /cgn2_6/ptodata/2/pubpaa/US09B_PUBCOMB.pep:
                11: /cgn2_6/ptodata/2/pubpaa/US09C_PUBCOMB.pep:
                12: /cgn2_6/ptodata/2/pubpaa/US09_NEW_PUB.pep:*
                13: /cgn2_6/ptodata/2/pubpaa/US10A_PUBCOMB.pep:
                14: /cgn2_6/ptodata/2/pubpaa/US10B_PUBCOMB.pep:
                15: /cgn2_6/ptodata/2/pubpaa/US10C_PUBCOMB.pep:
                16: /cgn2_6/ptodata/2/pubpaa/US10_NEW_PUB.pep:*
                17: /cgn2_6/ptodata/2/pubpaa/US60_NEW_PUB.pep:*
                18: /cgn2_6/ptodata/2/pubpaa/US60_PUBCOMB.pep:*


Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
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44 
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Sequence 30, Appl 
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2397 
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453 

11 

US-09-869-414-30 

Sequence 30, Appl 


ALIGNMENTS 


RESULT 1 
US-09-794-927-4 

; Sequence 4, Application US/09794927
; Patent No. US20010016324A1
; GENERAL INFORMATION:
; APPLICANT: Gurney, Mark E.
APPLICANT: Bienkowski, Michael J.
APPLICANT: Heinrikson, Robert L.
APPLICANT: Parodi, Luis A.
APPLICANT: Yan, Riqiang
TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR,
AND
TITLE OF INVENTION: USES
TITLE OF INVENTION: THEREFOR
FILE REFERENCE: 28341/6280FG
CURRENT APPLICATION NUMBER: US/09/794,927
CURRENT FILING DATE: 2001-02-27
PRIOR APPLICATION NUMBER: 09/416,901
PRIOR FILING DATE: 1999-10-13
PRIOR APPLICATION NUMBER: 60/155,493
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: 09/404,133
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: PCT/US99/20881
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: 60/101,594
PRIOR FILING DATE: 1998-09-24
NUMBER OF SEQ ID NOS: 73
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 4
LENGTH: 501
TYPE: PRT
ORGANISM: Homo sapiens
US-09-794-927-4

Query Match 100.0%; Score 2664; DB 9; Length 501; 

Best Local Similarity 100.0%; Pred. No. 2.6e-253; 

Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       |||||||||||||||||||||
Db 481 RCLRCLRQQHDDFADDISLLK 501
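
Result 1 reaches the perfect score because an identical, ungapped alignment scores every query residue against itself. Assuming the perfect score is defined that way (an inference, not something this report states), it should equal the sum of the BLOSUM62 diagonal entries over the 501-residue query. The sketch below hard-codes the standard BLOSUM62 diagonal, takes the query from the Result 1 alignment with residues 404-405 read as VV, and uses the illustrative name `self_score`.

```python
# Diagonal (self-substitution) scores of the standard BLOSUM62 matrix.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6,
    "H": 8, "I": 4, "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4,
    "T": 5, "W": 11, "Y": 7, "V": 4,
}

# Query US-09-869-414A-4 (SEQ ID NO 4), 60 residues per line as in the alignment.
QUERY = (
    "MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF"
    "VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST"
    "YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL"
    "GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI"
    "DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK"
    "VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT"
    "ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC"
    "HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW"
    "RCLRCLRQQHDDFADDISLLK"
)


def self_score(seq):
    """Score of a sequence aligned to itself without gaps under BLOSUM62."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in seq)


print(len(QUERY))         # 501
print(self_score(QUERY))  # 2664
```

No gap penalties enter this sum, which is consistent with the "Indels 0; Gaps 0" line for the 100% hits.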


RESULT 2 
US-09-795-847-4 

Sequence 4, Application US/09795847
Patent No. US20010018208A1
GENERAL INFORMATION:
APPLICANT: Gurney, Mark E.
APPLICANT: Bienkowski, Michael J.
APPLICANT: Heinrikson, Robert L.
APPLICANT: Parodi, Luis A.
APPLICANT: Yan, Riqiang
TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR,
AND
TITLE OF INVENTION: USES
TITLE OF INVENTION: THEREFOR
FILE REFERENCE: 28341/6280DE
CURRENT APPLICATION NUMBER: US/09/795,847
CURRENT FILING DATE: 2001-02-28
PRIOR APPLICATION NUMBER: 09/416,901
PRIOR FILING DATE: 1999-10-13
PRIOR APPLICATION NUMBER: 60/155,493
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: 09/404,133
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: PCT/US99/20881
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: 60/101,594
PRIOR FILING DATE: 1998-09-24
NUMBER OF SEQ ID NOS: 73
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 4
LENGTH: 501
TYPE: PRT
ORGANISM: Homo sapiens
US-09-795-847-4


Query Match 100.0%; Score 2664; DB 9; Length 501; 

Best Local Similarity 100.0%; Pred. No. 2.6e-253; 

Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       |||||||||||||||||||||
Db 481 RCLRCLRQQHDDFADDISLLK 501


RESULT 3 
US-09-794-743-4 

Sequence 4, Application US/09794743
Patent No. US20010021391A1
GENERAL INFORMATION:
APPLICANT: Gurney, Mark E.
APPLICANT: Bienkowski, Michael J.
APPLICANT: Heinrikson, Robert L.
APPLICANT: Parodi, Luis A.
APPLICANT: Yan, Riqiang
TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR,
AND
TITLE OF INVENTION: USES
TITLE OF INVENTION: THEREFOR
FILE REFERENCE: 28341/6280BC
CURRENT APPLICATION NUMBER: US/09/794,743
CURRENT FILING DATE: 2001-02-27
PRIOR APPLICATION NUMBER: 09/416,901
PRIOR FILING DATE: 1999-10-13
PRIOR APPLICATION NUMBER: 60/155,493
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: 09/404,133
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: PCT/US99/20881
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: 60/101,594
PRIOR FILING DATE: 1998-09-24
NUMBER OF SEQ ID NOS: 73
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 4
LENGTH: 501
TYPE: PRT
ORGANISM: Homo sapiens
US-09-794-743-4

Query Match 100.0%; Score 2664; DB 9; Length 501; 

Best Local Similarity 100.0%; Pred. No. 2.6e-253; 

Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       |||||||||||||||||||||
Db 481 RCLRCLRQQHDDFADDISLLK 501


RESULT 4 
US-09-794-748-4 

; Sequence 4, Application US/09794748
; Patent No. US20020037315A1
; GENERAL INFORMATION:
; APPLICANT: Gurney, Mark E.
APPLICANT: Bienkowski, Michael J.
APPLICANT: Heinrikson, Robert L.
APPLICANT: Parodi, Luis A.
APPLICANT: Yan, Riqiang
TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR,
AND
TITLE OF INVENTION: USES
TITLE OF INVENTION: THEREFOR
FILE REFERENCE: 28341/6280JL
CURRENT APPLICATION NUMBER: US/09/794,748
CURRENT FILING DATE: 2001-02-27
PRIOR APPLICATION NUMBER: 09/416,901
PRIOR FILING DATE: 1999-10-13
PRIOR APPLICATION NUMBER: 60/155,493
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: 09/404,133
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: PCT/US99/20881
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: 60/101,594
PRIOR FILING DATE: 1998-09-24
NUMBER OF SEQ ID NOS: 73
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 4
LENGTH: 501
TYPE: PRT
ORGANISM: Homo sapiens
US-09-794-748-4

Query Match 100.0%; Score 2664; DB 9; Length 501; 

Best Local Similarity 100.0%; Pred. No. 2.6e-253; 

Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       |||||||||||||||||||||
Db 481 RCLRCLRQQHDDFADDISLLK 501


RESULT 5 
US-09-794-925-4 

Sequence 4, Application US/09794925
Patent No. US20020064819A1
GENERAL INFORMATION:
APPLICANT: Gurney, Mark E.
APPLICANT: Bienkowski, Michael J.
APPLICANT: Heinrikson, Robert L.
APPLICANT: Parodi, Luis A.
APPLICANT: Yan, Riqiang
TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR,
AND USES
TITLE OF INVENTION: THEREFOR
FILE REFERENCE: 28341/6280HI
CURRENT APPLICATION NUMBER: US/09/794,925
CURRENT FILING DATE: 2001-02-27
PRIOR APPLICATION NUMBER: 09/416,901
PRIOR FILING DATE: 1999-10-13
PRIOR APPLICATION NUMBER: 60/155,493
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: 09/404,133
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: PCT/US99/20881
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: 60/101,594
PRIOR FILING DATE: 1998-09-24
NUMBER OF SEQ ID NOS: 73
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 4
LENGTH: 501
TYPE: PRT
ORGANISM: Homo sapiens
US-09-794-925-4


Query Match 100.0%; Score 2664; DB 9; Length 501; 

Best Local Similarity 100.0%; Pred. No. 2.6e-253; 

Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I- I I 
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

Qy 61 VEMVDNLRGKSGQGYYVEMTVGS PPQTLNI LVDTGS SNFAVGAAPHPFLHRYYQRQLS ST 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 


Db 


61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120 


Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 

Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240 

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

Qy 301 VFEAAVKS I KAAS STEKFPDGFWLGEQLVCWQAGTTPWNI FPVI S LYLMGEVTNQS FRIT 360 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360 

Qy 361 I LPQQYLRPVEDVATSQDDCYKFAI SQ S STGTVMGAVIMEGFYWFDRARKRI GFAVSAC 420 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I 
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I I I I I I I I I 
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501

I I I I I I I I I I I I I I I I I I I I I 

Db 481 RCLRCLRQQHDDFADDISLLK 501


RESULT 6 
US-09-681-442-4 

Sequence 4, Application US/09681442 
Patent No. US20020081634A1
GENERAL INFORMATION: 
APPLICANT: Gurney, Mark E. 
APPLICANT: Bienkowski, Michael J. 
APPLICANT: Heinrikson, Robert L. 
APPLICANT: Parodi, Luis A. 
APPLICANT: Yan, Riqiang 

TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR, AND USES
TITLE OF INVENTION: THEREFOR
FILE REFERENCE: 28341/6280FG 
CURRENT APPLICATION NUMBER: US/09/681,442
CURRENT FILING DATE: 2001-04-05 
PRIOR APPLICATION NUMBER: 09/416,901 
PRIOR FILING DATE: 1999-10-13 
PRIOR APPLICATION NUMBER: 60/155,493 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: 09/404,133 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: PCT/US99/20881
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: 60/101,594 
PRIOR FILING DATE: 1998-09-24 


NUMBER OF SEQ ID NOS : 73 
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 4 
LENGTH: 501 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-681-442-4 

Query Match 100.0%; Score 2664; DB 9; Length 501; 

Best Local Similarity 100.0%; Pred. No. 2.6e-253; 

Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120 

I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I 

Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240 

I I I II I I I I I I I I I II I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240 

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

I I I I I 1 I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360 

I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

! I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I 
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501

I I I I I I I I I I I I I I I I I I I I I 

Db 481 RCLRCLRQQHDDFADDISLLK 501


RESULT 7 
US-09-869-414-4 

; Sequence 4, Application US/09869414 
; Publication No. US20030077226A1 
; GENERAL INFORMATION: 

APPLICANT: Beinkowski et al. 
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR, AND USES
TITLE OF INVENTION: THEREFOR
FILE REFERENCE: 28341/6280M 

CURRENT APPLICATION NUMBER: US/09/869,414
CURRENT FILING DATE: 2001-06-27 
PRIOR APPLICATION NUMBER: 09/416,901 
PRIOR FILING DATE: 1999-10-13 
PRIOR APPLICATION NUMBER: 60/155,493 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: 09/404,133 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: PCT/US99/20881 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: 60/101,594
PRIOR FILING DATE: 1998-09-24 
NUMBER OF SEQ ID NOS : 73 
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 4 
LENGTH: 501 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-869-414-4 

Query Match 100.0%; Score 2664; DB 11; Length 501; 

Best Local Similarity 100.0%; Pred. No. 2.6e-253; 

Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

I I I I I I I I I I I I M I I I I I 1 I I M I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I M ! I I I I 
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

I I I I I I I I I I I I I I I I I M I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240

I I I I I I I I I I I I I E I I I I ! II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240 

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

I I I I i I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420


Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501

I I I I II I I I I I I I I I I I I I I I 
Db 481 RCLRCLRQQHDDFADDISLLK 501


RESULT 8 
US-09-548-366-4 

Sequence 4, Application US/09548366 
Publication No. US20030104365A1 
GENERAL INFORMATION: 


APPLICANT: Gurney, Mark E.
APPLICANT: Bienkowski, Michael J.
APPLICANT: Heinrikson, Robert L.
APPLICANT: Parodi, Luis A.
APPLICANT: Yan, Riqiang
TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR, AND
TITLE OF INVENTION: USES THEREFOR
FILE REFERENCE: 28341/6280A 
CURRENT APPLICATION NUMBER: US/09/548,366 
CURRENT FILING DATE: 2000-04-12 
PRIOR APPLICATION NUMBER: 60/155,493
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: 09/404,133
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: PCT/US99/20881
PRIOR FILING DATE: 1999-09-23
PRIOR APPLICATION NUMBER: 60/101,594
PRIOR FILING DATE: 1998-09-24
NUMBER OF SEQ ID NOS: 65
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 4
LENGTH: 501
TYPE: PRT

ORGANISM: Homo sapiens
US-09-548-366-4

Query Match 100.0%; Score 2664; DB 11; Length 501; 

Best Local Similarity 100.0%; Pred. No. 2.6e-253; 

Matches 501; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I 
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 


Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

I I I I I I I I I I I I I I I I I I I I II I M I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

I I I I I I I I I I I I I I I I I I 1 I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 

Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 


Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240 

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I 
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240


Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 

Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I ! I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I 
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I 

Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501

II I I I I I I I I I M I I I I I I I I 

Db 481 RCLRCLRQQHDDFADDISLLK 501


RESULT 9 
US-10-372-473-1 

; Sequence 1, Application US/10372473 

; Publication No. US20040005691A1

; GENERAL INFORMATION: 

; APPLICANT: Chou, Kuo-Chen 

; APPLICANT: Howe, W. Jeffery 

; TITLE OF INVENTION: Modified BACE 

FILE REFERENCE: MBHB 01-1766-A 
; CURRENT APPLICATION NUMBER: US/10/372,473
; CURRENT FILING DATE: 2003-02-21 
; NUMBER OF SEQ ID NOS : 24 
; SOFTWARE: PatentIn version 3.2
; SEQ ID NO 1 

LENGTH: 501 

TYPE: PRT 

ORGANISM: Homo sapiens 
; FEATURE: 

NAME/KEY: MISC_FEATURE
; OTHER INFORMATION: Amino acid sequence of human BACE. 
US-10-372-473-1 

Query Match 99.7%; Score 2656; DB 12; Length 501; 

Best Local Similarity 99.8%; Pred. No. 1.6e-252; 

Matches 500; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I 
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120


Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240 

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240 

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501 

II I I I I I M I I I I I I II I I I I 

Db 481 RCLRCLRQQHDDFADDISLLK 501 


RESULT 10 
US-10-032-818-4 

; Sequence 4, Application US/10032818 

; Publication No. US20030092629A1 

; GENERAL INFORMATION: 

; APPLICANT: Tang, Jordan J.N. 

; APPLICANT: Koelsch, Gerald 

; APPLICANT: Ghosh, Arun K. 

TITLE OF INVENTION: Inhibitors of Memapsin 2 and Use Thereof 

FILE REFERENCE: 2932.1006-007
; CURRENT APPLICATION NUMBER: US/10/032,818
; CURRENT FILING DATE: 2001-12-28 
; PRIOR APPLICATION NUMBER: US 60/275,756 
; PRIOR FILING DATE: 2001-03-14 
; PRIOR APPLICATION NUMBER: US 60/258,705 

PRIOR FILING DATE: 2000-12-28 
; NUMBER OF SEQ ID NOS : 83 

SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 4 

LENGTH: 501 
; TYPE: PRT 

ORGANISM: Homo sapien 
US-10-032-818-4 


Query Match 99.7%; Score 2656; DB 15; Length 501;

Best Local Similarity 99.8%; Pred. No. 1.6e-252;

Matches 500; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 

Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I 
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I II I 
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

I I I I I I I I I I I I' I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501

I I I I II I I I I I I I I I I I I I I I 
Db 481 RCLRCLRQQHDDFADDISLLK 501


RESULT 11 
US-10-214-932-104 

; Sequence 104, Application US/10214932 

; Publication No. US20030100707A1 

; GENERAL INFORMATION: 

; APPLICANT: HWANG, Inhwan 

; APPLICANT: KIM, Dae Heon 

; APPLICANT: LEE, Yong Jik 

; TITLE OF INVENTION: SYSTEM FOR DETECTING PROTEASE 
; FILE REFERENCE: APB02/US 

; CURRENT APPLICATION NUMBER: US/10/214,932

; CURRENT FILING DATE: 2002-08-08 

; NUMBER OF SEQ ID NOS : 133 

; SOFTWARE: PatentIn version 3.1

; SEQ ID NO 104 

; LENGTH: 501 

; TYPE: PRT 

; ORGANISM: Homo sapiens 
US-10-214-932-104 


Query Match 99.7%; Score 2656; DB 15; Length 501;

Best Local Similarity 99.8%; Pred. No. 1.6e-252;

Matches 500; Conservative 0; Mismatches 1; Indels 0; Gaps 0;


Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

I I I I I I I I I I I I I M I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I 

Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501 

M I I I I I I I I I I I I I I I I I I I 
Db 481 RCLRCLRQQHDDFADDISLLK 501 


RESULT 12 
US-09-969-671A-2 

; Sequence 2, Application US/09969671A 

; Publication No. US20030036112A1 

; GENERAL INFORMATION: 

; APPLICANT: CHAPMAN, CONRAD G.

; APPLICANT: MURPHY, KAY 

; APPLICANT: POWELL, DAVID J. 

; APPLICANT: SMITH, TRUDI S. 

; TITLE OF INVENTION: ASP2 

FILE REFERENCE: GH-70368-D1 
; CURRENT APPLICATION NUMBER: US/09/969,671A
; CURRENT FILING DATE: 2001-10-03 
; PRIOR APPLICATION NUMBER: UK 9701684.4 
; PRIOR FILING DATE: 1997-01-28 
; PRIOR APPLICATION NUMBER: 09/009,191 
; PRIOR FILING DATE: 1998-01-20 


PRIOR APPLICATION NUMBER: 09/694,200 
PRIOR FILING DATE: 2000-10-23 
NUMBER OF SEQ ID NOS : 6 

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 2 
LENGTH: 501 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-969-671A-2 

Query Match 99.5%; Score 2650; DB 11; Length 501; 

Best Local Similarity 99.6%; Pred. No. 6.3e-252; 

Matches 499; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

I I I I I I I I I I I M I I I I I I I M I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I 
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I II I I I I I I I I I I M I I I I 
Db 121 YRDLRKGVYEPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240 

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

I I I I I I I I I I I I I I I I I- I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I 

Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480 

Qy 481 RCLRCLRQQHDDFADDISLLK 501

I II I I I I I I II I I I I I I I I I I 

Db 481 RCLRCLRQQHDDFADDISLLK 501


RESULT 13 

US-10-372-730-9 

; Sequence 9, Application US/10372730 
; Publication No. US20030167486A1 
; GENERAL INFORMATION: 
; APPLICANT: Jacobsen, Helmut 


APPLICANT: Mosbach-Ozmen, Laurence 
APPLICANT: Nellboeck-Hochstetter, Peter

TITLE OF INVENTION: Double transgenic animal model for Alzheimer's Disease 
FILE REFERENCE: Case 21132 

CURRENT APPLICATION NUMBER: US/10/372,730
CURRENT FILING DATE: 2003-02-24 
PRIOR APPLICATION NUMBER: EP02004331.1 
PRIOR FILING DATE: 2002-03-01 
NUMBER OF SEQ ID NOS : 19 
SOFTWARE: PatentIn version 3.1
SEQ ID NO 9 
LENGTH: 501 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-372-730-9 

Query Match 99.5%; Score 2650; DB 12; Length 501; 

Best Local Similarity 99.6%; Pred. No. 6.3e-252; 

Matches 499; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I II I I I I I II I I I I I I I I I I I 
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 
Db 121 YRDLRKGVYEPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240

I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

I I I I M I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501 

I I I I I I II I I I I I I I I I I I I I 

Db 481 RCLRCLRQQHDDFADDISLLK 501 


RESULT 14 
US-10-308-365-2 

Sequence 2, Application US/10308365 
Publication No. US20030109022A1 
GENERAL INFORMATION: 
APPLICANT: CHAPMAN, CONRAD G. 
APPLICANT: MURPHY, KAY 
APPLICANT: POWELL, DAVID J. 
APPLICANT: SMITH, TRUDI S. 
TITLE OF INVENTION: ASP2
FILE REFERENCE: GH-70368-2

CURRENT APPLICATION NUMBER: US/10/308,365
CURRENT FILING DATE: 2002-12-03 
PRIOR APPLICATION NUMBER: US/09/694,200
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: UK 9701684.4 
PRIOR FILING DATE: 1997-01-28 
PRIOR APPLICATION NUMBER: 09/009,191 
PRIOR FILING DATE: 1998-01-20 
NUMBER OF SEQ ID NOS : 6 

SOFTWARE: FastSEQ for Windows Version 3.0 
SEQ ID NO 2 
LENGTH: 501 
TYPE: PRT 

ORGANISM: HOMO SAPIENS 
US-10-308-365-2 

Query Match 99.5%; Score 2650; DB 15; Length 501; 

Best Local Similarity 99.6%; Pred. No. 6.3e-252; 

Matches 499; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

I I II M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I II I I I 

Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 YRDLRKGVYEPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240 

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

I I I I I I M I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360 


Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420


Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I M I I I I I I I I I I I I I I I I I I I 
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501 

I M I I I I I I I I I I I I I I I I I I 
Db 481 RCLRCLRQQHDDFADDISLLK 501 


RESULT 15 
US-09-796-264-2 

Sequence 2, Application US/09796264 
Patent No. US20020049303A1 
GENERAL INFORMATION: 


APPLICANT: Tang, Jordan J.N.
APPLICANT: Lin, Xinli
APPLICANT: Koelsch, Gerald

TITLE OF INVENTION: Catalytically Active Recombinant Memapsin and Methods 
TITLE OF INVENTION: of Use Thereof 
FILE REFERENCE: OMRF 179

CURRENT APPLICATION NUMBER: US/09/796,264
CURRENT FILING DATE: 2001-02-28 
PRIOR APPLICATION NUMBER: 09/604,608 
PRIOR FILING DATE: 2000-06-27 
PRIOR APPLICATION NUMBER: 60/168,060 
PRIOR FILING DATE: 1999-11-30 
PRIOR APPLICATION NUMBER: 60/177,836 
PRIOR FILING DATE: 2000-01-25 
PRIOR APPLICATION NUMBER: 60/178,368 
PRIOR FILING DATE: 2000-01-27 
PRIOR APPLICATION NUMBER: 60/210,292 
PRIOR FILING DATE: 2000-06-08 
NUMBER OF SEQ ID NOS : 31 
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 2 

LENGTH: 488 

TYPE: PRT 

ORGANISM: Homo sapiens 


FEATURE:
OTHER INFORMATION: Purified Memapsin 2
OTHER INFORMATION: Amino Acids 28-48 are remnant putative propeptide
OTHER INFORMATION: residues
OTHER INFORMATION: Amino Acids 58-61, 78, 80, 82-83, 116, 118-121,
OTHER INFORMATION: 156, 166, 174, 246, 274, 276, 278-281, 283, and
OTHER INFORMATION: 376-377 are residues in contact with the OM99-2
OTHER INFORMATION: inhibitor
OTHER INFORMATION: Amino acids 54-57, 61-68, 73-80, 86-89, 109-111,
OTHER INFORMATION: 113-118, 123-134, 143-154, 165-168, 198-202, and
OTHER INFORMATION: 220-224 are N-lobe Beta Strands
OTHER INFORMATION: Amino Acids 184-191 and 210-217 are N-lobe Helices
OTHER INFORMATION: Amino acids 237-240, 247-249, 251-256, 259-260,
OTHER INFORMATION: 273-275, 282-285, 316-318, 331-336, 342-348,
OTHER INFORMATION: 354-357, 366-370, 372-375, 380-383, 390-395,
OTHER INFORMATION: 400-405, and 418-420 are C-lobe Beta Strands
OTHER INFORMATION: Amino Acids 286-299, 307-310, 350-353, 384-387,
OTHER INFORMATION: and 427-431 are C-lobe Helices
US-09-796-264-2 


Query Match 96.9%; Score 2582; DB 9; Length 488; 

Best Local Similarity 99.8%; Pred. No. 3e-245; 

Matches 487; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy     14 AGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRGKSGQ 73
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 AGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRGKSGQ 60

Qy     74 GYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVYVPYT 133
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 GYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVYVPYT 120

Qy    134 QGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARPDDS 193
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 QGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARPDDS 180

Qy    194 LEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTP 253
          |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db    181 LEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTP 240

Qy    254 IRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAAS 313
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 IRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAAS 300

Qy    314 STEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDV 373
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 STEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDV 360

Qy    374 ATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTAAVEG 433
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 ATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTAAVEG 420

Qy    434 PFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQWRCLRCLRQQHDDF 493
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 PFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQWRCLRCLRQQHDDF 480

Qy    494 ADDISLLK 501
          ||||||||
Db    481 ADDISLLK 488


Search completed: January 21, 2004, 09:41:41 
Job time : 101.583 secs


GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 


OM protein - protein search, using sw model

Run on:          January 21, 2004, 09:16:19 ; Search time 103.457 Seconds
                 (without alignments)
                 1249.644 Million cell updates/sec

Title:           US-09-869-414A-4
Perfect score:   2664
Sequence:        1 MAQALPWLLLWMGAGVLPAH...CLRCLRQQHDDFADDISLLK 501

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        830525 seqs, 258052604 residues

Total number of hits satisfying chosen parameters:      830525

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       SPTREMBL_23:*
                 1: sp_archea:*
                 2: sp_bacteria:*
                 3: sp_fungi:*
                 4: sp_human:*
                 5: sp_invertebrate:*
                 6: sp_mammal:*
                 7: sp_mhc:*
                 8: sp_organelle:*
                 9: sp_phage:*
                 10: sp_plant:*
                 11: sp_rodent:*
                 12: sp_virus:*
                 13: sp_vertebrate:*
                 14: sp_unclassified:*
                 15: sp_rvirus:*
                 16: sp_bacteriap:*
                 17: sp_archeap:*


Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
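
The run header names the search model (sw, i.e. Smith-Waterman local alignment) and the gap parameters Gapop 10.0 and Gapext 0.5. The following is a minimal sketch of an affine-gap local alignment score, not the GenCore implementation. It makes two assumptions that the report does not confirm: a toy match/mismatch score stands in for the BLOSUM62 table, and a gap of length k is charged gap_open + (k - 1) * gap_ext. The function name `local_affine_score` is hypothetical.

```python
def local_affine_score(a, b, match=5.0, mismatch=-4.0,
                       gap_open=10.0, gap_ext=0.5):
    """Best local alignment score with affine gaps (Gotoh recurrence).

    Assumptions (not taken from the report): toy match/mismatch scores
    replace BLOSUM62, and a gap of length k costs
    gap_open + (k - 1) * gap_ext.
    """
    neg = float("-inf")
    cols = len(b) + 1
    h_prev = [0.0] * cols   # H values for the previous row
    f_prev = [neg] * cols   # F: alignments ending in a vertical gap
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * cols
        f_cur = [neg] * cols
        e = neg             # E: alignments ending in a horizontal gap
        for j in range(1, cols):
            # Open a new gap from H, or extend an existing one.
            e = max(h_cur[j - 1] - gap_open, e - gap_ext)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_ext)
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores are floored at zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + sub, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


if __name__ == "__main__":
    print(local_affine_score("AAAAAATTTTTT", "AAAAAACCTTTTTT"))
```

Only two rows of the matrix are kept, which is enough to obtain the score; recovering the alignment itself would need a traceback over the full matrix.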

SUMMARIES 


Result           Query
   No.    Score  Match Length  DB  ID        Description

     1     2650   99.5    501   4  Q8IYC8    Q8iyc8 homo sapien
     2     2566   96.3    501  11  Q8C7R1    Q8c7r1 mus musculu
     3     2562   96.2    501  11  Q8BQY4    Q8bqy4 mus musculu
     4   2478.5   93.0    532   4  Q9ULS1    Q9uls1 homo sapien
     5     2374   89.1    467  11  Q8C4F4    Q8c4f4 mus musculu
     6     1412   53.0    267  11  Q9CUU5    Q9cuu5 mus musculu
     7     1156   43.4    514  11  Q8C5E9    Q8c5e9 mus musculu
     8   1155.5   43.4    439   4  Q9H2V8    Q9h2v8 homo sapien
     9     1150   43.2    514  11  Q9JL18    Q9jl18 mus musculu
    10     1150   43.2    514  11  Q8C793    Q8c793 mus musculu
    11   1072.5   40.3    423   4  Q8N2D4    Q8n2d4 homo sapien
    12    974.5   36.6    468   4  Q9NZL2    Q9nzl2 homo sapien
    13    969.5   36.4    396   4  Q9NZL1    Q9nzl1 homo sapien
    14    712.5   26.7    213   4  Q9P0D2    Q9p0d2 homo sapien
    15    596.5   22.4    255  11  Q9R1P7    Q9r1p7 mus musculu
    16    354.5   13.3    244   5  Q8WQY9    Q8wqy9 aphrocallis
    17      345   13.0     76   4  Q8N698    Q8n698 homo sapien
    18    335.5   12.6    391   5  Q9VKP6    Q9vkp6 drosophila
    19      335   12.6    354   5  Q9GYX7    Q9gyx7 boophilus m
    20      319   12.0    384  13  Q9DEC2    Q9dec2 xenopus lae
    21    313.5   11.8    385  13  Q9DEC4    Q9dec4 rana catesb
    22    312.5   11.7    386   6  Q9BGU5    Q9bgu5 bos taurus
    23      311   11.7    387   6  Q9GMY8    Q9gmy8 sorex ungui
    24      310   11.6    372   5  Q9VLK3    Q9vlk3 drosophila
    25      308   11.6    386   6  Q9GMY7    Q9gmy7 rhinolophus
    26    307.5   11.5    383  13  Q9DEC3    Q9dec3 xenopus lae
    27    307.5   11.5    387  13  Q9DDV5    Q9ddv5 salvelinus
    28      307   11.5    387   6  Q9GMY9    Q9gmy9 suncus muri
    29    306.5   11.5    383  13  Q9DE45    Q9de45 salvelinus
    30    305.5   11.5    376  13  Q9PUR8    Q9pur8 pseudopleur
    31      305   11.4    384  13  Q91322    Q91322 rana catesb
    32      304   11.4    382  13  Q9PRG9    Q9prg9 gallus gall
    33      304   11.4    423   5  Q9VKP7    Q9vkp7 drosophila
    34    298.5   11.2    386   6  Q9GMY6    Q9gmy6 canis famil
    35    296.5   11.1    396  13  O93428    O93428 chionodraco
    36    295.5   11.1    398  13  Q8JH28    Q8jh28 brachydanio
    37    295.5   11.1    398  13  Q8AWD9    Q8awd9 brachydanio
    38    293.5   11.0    381   6  Q9GK11    Q9gk11 camelus dro
    39      293   11.0    399  13  O93458    O93458 podarcis si
    40    290.5   10.9    380   6  Q28950    Q28950 sus scrofa
    41    289.5   10.9    399  13  Q9DD89    Q9dd89 brachydanio
    42      288   10.8    387   6  Q8MJU4    Q8mju4 oryctolagus
    43    287.5   10.8    444   5  Q21966    Q21966 caenorhabdi
    44      287   10.8    427   5  P91802    P91802 schistosoma
    45    286.5   10.8    378  13  Q9PUR9    Q9pur9 pseudopleur
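
The Query Match column in the summaries is consistent with the hit score expressed as a percentage of the perfect score (2664 for this query). That relationship is inferred from the numbers in the table, not stated by the report, and the helper name `query_match_percent` is hypothetical. A minimal sketch:

```python
PERFECT_SCORE = 2664  # "Perfect score" from the run header


def query_match_percent(score, perfect_score=PERFECT_SCORE):
    """Score as a percentage of the perfect score, to one decimal place.

    Assumption: this reproduces the report's "Query Match" column; the
    report does not document how the column is computed or rounded.
    """
    return round(100.0 * score / perfect_score, 1)


if __name__ == "__main__":
    # Scores of results 1, 2 and 6 from the summary table.
    for score in (2650, 2566, 1412):
        print(score, query_match_percent(score))
```

Values that fall close to a rounding boundary may differ from the printed column in the last digit, since the report's rounding rule is unknown.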


ALIGNMENTS 


RESULT 1 
Q8IYC8 

ID Q8IYC8 PRELIMINARY; PRT; 501 AA.

AC Q8IYC8; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created)

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update) 


DE Beta-site APP-cleaving enzyme. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Brain; 

RA Strausberg R.;

RL Submitted (JUL-2002) to the EMBL/GenBank/DDBJ databases.

DR EMBL; BC036084; AAH36084.1; -. 

SQ SEQUENCE 501 AA; 55824 MW; 768595CF5517EFB7 CRC64;

Query Match 99.5%; Score 2650; DB 4; Length 501;

Best Local Similarity 99.6%; Pred. No. 1.7e-210; 

Matches 499; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy      1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy     61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||| ||
Db     61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLFST 120

Qy    121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy    181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
          ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db    181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy    241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy    301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy    361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy    421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy    481 RCLRCLRQQHDDFADDISLLK 501
          |||||||||||||||||||||
Db    481 RCLRCLRQQHDDFADDISLLK 501
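
The Best Local Similarity figures in these records are consistent with the number of identical positions divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). This is inferred from the printed counts rather than documented in the report, and the helper name `best_local_similarity` is hypothetical. A minimal sketch:

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of all alignment columns.

    Assumption: this reproduces the report's "Best Local Similarity"
    line; the formula is inferred from the printed counts.
    """
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


if __name__ == "__main__":
    # Counts printed for RESULT 1 (Q8IYC8) and RESULT 4 (Q9ULS1).
    print(best_local_similarity(499, 0, 2, 0))
    print(best_local_similarity(473, 1, 15, 3))
```

The same counts also give a quick sanity check on a record: with no indels, the column total should equal the length of the aligned query range.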


RESULT 2 
Q8C7R1 

ID Q8C7R1 PRELIMINARY; PRT; 501 AA. 


AC Q8C7R1; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created)

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update) 

DE Beta-site APP cleaving enzyme. 

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A.

RC STRAIN=C57BL/6J; TISSUE=Spinal cord;

RX MEDLINE=22354683; PubMed=12466851;

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002).

DR EMBL; AK049626; BAC33844.1; -. 

SQ SEQUENCE 501 AA; 55761 MW; B410DA8B64647663 CRC64; 

Query Match 96.3%; Score 2566; DB 11; Length 501; 

Best Local Similarity 96.0%; Pred. No. 1.5e-203; 

Matches 481; Conservative 8; Mismatches 12; Indels 0; Gaps 0; 

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

II II I I I I I : I : I : I I I II I I I I I II I I I I I I I I I I I I I I 1 I I I I I I I I I I I 
Db 1 MAPALHWLLLWVGSGMLPAQGTHLGIRLPLRSGLAGPPLGLRLPRETDEESEEPGRRGSF 60 

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

I I I I I I I I I I I I II I I I I I I : I I I I I I I I I I I II I I I I I M I I I I I I M I I I I I I I I I II 

Db 61 VEMVDNLRGKSGQGYYVEMTIGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120 

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I 

Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240 

II I I I I I I I I II I I I I I I I II I I I I I : I I : I I I 1111111111:1 I I I I I I I I I I I I I 

Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTEALASVGGSMIIGGI 240 

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I 
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

I I II I I I I I I I I I I I I I I! I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I 
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

I I I II I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I II I II I I I I I I I I II I I I I 
Db 361 ILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II 

Db 421 HVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480


Qy 481 RCLRCLRQQHDDFADDISLLK 501

I I I I I I I I I I I I I I I I I I II 
Db 481 RCLRCLRHQHDDFADDISLLK 501


RESULT 3 
Q8BQY4 

ID Q8BQY4 PRELIMINARY; PRT; 501 AA. 

AC Q8BQY4; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update) 

DE Beta-site APP cleaving enzyme. 

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Brain;

RX MEDLINE=22354683; PubMed=12466851;

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002).

DR EMBL; AK046175; BAC32620.1; -. 

SQ SEQUENCE 501 AA; 55816 MW; C0855513145E024E CRC64; 

Query Match 96.2%; Score 2562; DB 11; Length 501; 

Best Local Similarity 96.0%; Pred. No. 3.3e-203; 

Matches 481; Conservative 7; Mismatches 13; Indels 0; Gaps 0; 

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60 

II II I I I I I : I : I : I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1 MAPALHWLLLWVGSGMLPAQGTHLGIRLPLRSGLAGPPLGLRLPRETDEESEEPGRRGSF 60 

Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 YRDLRKGVTVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180 

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240 

I II I I I I I I I I I I I I I I I I I I I I I I I : I I : I I I I I I I I I I I I I : I I I I I I I I I I I I I I 

Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTEALASVGGSMIIGGI 240 

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I i I I I I I I I I I I I I I I I I I I 
Db 241 DHSLYTGRLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360


Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

I I I I I I I I I I I I I I I I II I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 ILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
Db 421 HVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480 

Qy 481 RCLRCLRQQHDDFADDISLLK 501 

I I I I I I I I I I I I I I I I I II I 
Db 481 RCLRCLRHQHDDFADDISLLK 501 


RESULT 4 
Q9ULS1 

ID Q9ULS1 PRELIMINARY; PRT; 532 AA. 

AC Q9ULS1; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created)

DT 01-OCT-2001 (TrEMBLrel. 18, Last sequence update) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update) 

DE Hypothetical protein KIAA1149 (Fragment) . 

GN KIAA1149. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Brain; 

RX MEDLINE=20039618; PubMed=10574461;

RA Hirosawa M., Nagase T., Ishikawa K., Kikuno R., Nomura N., Ohara O.;

RT "Characterization of cDNA clones selected by the GeneMark analysis 

RT from size-fractionated cDNA libraries from human brain."; 

RL DNA Res. 6:329-336(1999). 

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

DR EMBL; AB032975; BAA86463.2; 

DR HSSP; P56272; 1AM5. 

DR InterPro; IPR001461; AspproteaseA1.

DR InterPro; IPR001969; Aspprotease_site.

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 1. 

KW Hypothetical protein; Aspartyl protease; Hydrolase; Protease. 

FT NON_TER 1 1 

SQ SEQUENCE 532 AA; 58720 MW; 98B135D0D5FBD2E8 CRC64; 

Query Match 93.0%; Score 2478.5; DB 4; Length 532; 

Best Local Similarity 96.1%; Pred. No. 2.9e-196; 

Matches 473; Conservative 1; Mismatches 15; Indels 3; Gaps 2; 

Qy 11 WMGAGVLP-AHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRG 69 

I III I I I : I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II 

Db 43 WARECCLPTAPSTASG--CPCAAAWGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRG 100


Qy 70 KSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVY 129

I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 101 KSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVY 160


Qy 130 VPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIAR 189

I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 161 VPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIAR 220

Qy 190 PDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSL 249 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I 

Db 221 PDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSL 280 

Qy 250 WYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSI 309 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 281 WYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSI 340 

Qy 310 KAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRP 369 

I II I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
Db 341 KAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRP 400 

Qy 370 VEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTA 429

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 

Db 401 VEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTA 460

Qy 430 AVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQWRCLRCLRQQ 489

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I II II I I I I I I I I I 

Db 461 AVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQWRCLRCLRQQ 520


Qy 490 HDDFADDISLLK 501 

I I I I I I I I I I I I 
Db 521 HDDFADDISLLK 532 


RESULT 5 
Q8C4F4 
ID Q8C4F4 PRELIMINARY; PRT; 467 AA.
AC Q8C4F4;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Beta-site APP cleaving enzyme.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Cerebellum; 
RX MEDLINE=22354683; PubMed=12466851;
RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002). 

DR EMBL; AK082317; BAC38462.1; -. 

SQ SEQUENCE 467 AA; 52063 MW; 31AB674FF1843652 CRC64; 


Query Match 89.1%; Score 2374; DB 11; Length 467;

Best Local Similarity 89.4%; Pred. No. 1e-187;

Matches 448; Conservative 7; Mismatches 12; Indels 34; Gaps 1;

Qy 1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

II II I I I I I : I : I : I I I II I I I I I I I I I I I I I I I II I I I I I I I I I M I I I I I
Db 1 MAPALHWLLLWVGSGMLPAQGTHLGIRLPLRSGLAGPPLGLRLPRETDEESEEPGRRGSF 60


Qy 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240 

I I I I I I I I I I II I I I I I I I I I I I I I I : I I : I I I II I I I I I I I I : I I I I I I I I I I I I I I 

Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTEALASVGGSMIIGGI 240 

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKE-------------------- 280

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 
Db 281 --------------TEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 326

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

I I I I I I I I I I I I I I I I I I I I I I II : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 327 ILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 386

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

I I I I I I I I I I II M I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 
Db 387 HVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 446

Qy 481 RCLRCLRQQHDDFADDISLLK 501 

I I 1 I I I 1 I I I I I I I I I I I I I 
Db 447 RCLRCLRHQHDDFADDISLLK 467


RESULT 6
Q9CUU5

ID Q9CUU5 PRELIMINARY; PRT; 267 AA.
AC Q9CUU5;
DT 01-JUN-2001 (TrEMBLrel. 17, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Adult male brain cDNA, RIKEN full-length enriched library,
DE clone:3526402A15 product:beta-site APP cleaving enzyme, full insert
DE sequence (Fragment).
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=C57BL/6J; TISSUE=Brain;
RA Adachi J., Aizawa K., Akahira S., Akimura T., Arai A., Aono H.,
RA Arakawa T., Bono H., Carninci P., Fukuda S., Fukunishi Y., Furuno M.,


RA Hanagaki T., Hara A., Hayatsu N., Hiramoto K., Hiraoka T., Hori F.,

RA Imotani K., Ishii Y., Itoh M., Izawa M., Kasukawa T., Kato H.,

RA Kawai J., Kojima Y., Konno H., Kouda M., Koya S., Kurihara C.,

RA Matsuyama T., Miyazaki A., Nishi K., Nomura K., Numazaki R., Ohno M.,

RA Okazaki Y., Okido T., Owa C., Saito H., Saito R., Sakai C., Sakai K.,

RA Sano H., Sasaki D., Shibata K., Shibata Y., Shinagawa A., Shiraki T.,

RA Sogabe Y., Suzuki H., Tagami M., Tagawa A., Takahashi F., Tanaka T.,

RA Tejima Y., Toya T., Yamamura T., Yasunishi A., Yoshida K., Yoshino M.,

RA Muramatsu M., Hayashizaki Y.;

RL Submitted (JUL-2000) to the EMBL/GenBank/DDBJ databases.

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Brain; 

RX MEDLINE=22354683; PubMed=12466851;

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Brain; 

RX MEDLINE=21085660; PubMed=11217851;

RA RIKEN FANTOM Consortium; 

RT "Functional annotation of a full-length mouse cDNA collection."; 

RL Nature 409:685-690(2001).

RN [4] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Brain;

RX MEDLINE=99279253; PubMed=10349636;

RA Carninci P., Hayashizaki Y.; 

RT "High-efficiency full-length cDNA cloning."; 

RL Meth. Enzymol. 303:19-44(1999). 

RN [5] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Brain; 

RX MEDLINE=20499374; PubMed=11042159;

RA Carninci P., Shibata Y., Hayatsu N., Sugahara Y., Shibata K., Itoh M.,

RA Konno H., Okazaki Y., Muramatsu M., Hayashizaki Y.;

RT "Normalization and subtraction of cap-trapper-selected cDNAs to 

RT prepare full-length cDNA libraries for rapid discovery of new genes."; 

RL Genome Res. 10:1617-1630(2000). 

RN [6] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Brain; 

RX MEDLINE=20530913; PubMed=11076861;

RA Shibata K., Itoh M., Aizawa K., Nagaoka S., Sasaki N., Carninci P.,

RA Konno H., Akiyama J., Nishi K., Kitsunai T., Tashiro H., Itoh M.,

RA Sumi N., Ishii Y., Nakamura S., Hazama M., Nishine T., Harada A.,

RA Yamamoto R., Matsumoto H., Sakaguchi S., Ikegami T., Kashiwagi K.,

RA Fujiwake S., Inoue K., Togawa Y., Izawa M., Ohara E., Watahiki M.,

RA Yoneda Y., Ishikawa T., Ozawa K., Tanaka T., Matsuura S., Kawai J.,

RA Okazaki Y., Muramatsu M., Inoue Y., Kira A., Hayashizaki Y.;

RT "RIKEN integrated sequence analysis (RISA) system-384-format

RT sequencing pipeline with 384 multicapillary sequencer."; 

RL Genome Res. 10:1757-1771(2000). 

DR EMBL; AK014390; BAB29317.2; -. 


FT NON_TER 1 1

SQ SEQUENCE 267 AA; 30333 MW; 9413EB4530AB63B0 CRC64;

Query Match 53.0%; Score 1412; DB 11; Length 267; 

Best Local Similarity 98.9%; Pred. No. 1.6e-108; 

Matches 264; Conservative 1; Mismatches 2; Indels 0; Gaps 0; 

Qy 235 MIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTN 294 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I II I I I I I I I I I I I I I I I 1 I I I I 
Db 1 MIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTN 60 

Qy 295 LRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTN 354 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 
Db 61 LRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTN 120

Qy 355 QSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIG 414

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I II I I I I I I I I I I I 
Db 121 QSFRITILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVFDRARKRIG 180

Qy 415 FAVSACHVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLC 474

I I I I I I I II II I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I II II I I I I I I I I I 

Db 181 FAVSACHVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLC 240

Qy 475 LMVCQWRCLRCLRQQHDDFADDISLLK 501

I I I I II I II I I I I I I II I I I I I I I M 
Db 241 LMVCQWRCLRCLRHQHDDFADDISLLK 267 


RESULT 7 
Q8C5E9 

ID Q8C5E9 PRELIMINARY; PRT; 514 AA. 

AC Q8C5E9; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update) 

DE Beta-site APP-cleaving enzyme 2. 

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A.

RC STRAIN=C57BL/6J; TISSUE=Testis;

RX MEDLINE=22354683; PubMed=12466851;

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002).

DR EMBL; AK078770; BAC37384.1; -. 

SQ SEQUENCE 514 AA; 55811 MW; CBB9237BB68A0B2E CRC64; 

Query Match 43.4%; Score 1156; DB 11; Length 514; 

Best Local Similarity 47.5%; Pred. No. 6.3e-87; 

Matches 235; Conservative 77; Mismatches 149; Indels 34; Gaps 9; 


Qy 2 AQAL PWLLLWMGAGV LP AHGTQHGIRLPLRSGLG — GAP LGLRL 43 


I I I I I :: I I I I I I I III II Ml 

Db 7 ALLLPVLAQWLLSAVPALAPAPFTLPLQVARATNH--RASAVPGLGTPGLPRADGLALAX 64

Qy 4 4 PRETDEEPEEPGR-RGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVG 102 

II I : I : I I I I I : I I I : I I I : I I : I : I I I : I I I I I I I I M I I 

Db 65 EPVRATANFLAMVDNLQGDSGRGYYLEMLIGTPPQKVQILVDTGSSNFAVA 115 

Qy 103 AAPHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAA 162

IN:: I : : I I I I I I I I II I I : I I I I : I I I I : I I I 

Db 116 GAPHSYIDTYFDSESSSTYHSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNSSFLVNIAT 175

Qy 163 ITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPL 222 

I I I : I I : I I I I I I I I I I : I : I I I I I I I I I I I : I : : I I : : I I I I I : 
Db 176 IFESENFFLPGIKWNGILGLAYAALAKPSSSLETFFDSLVAQAKIPDIFSMQMCGAGLPV 235

Qy 223 NQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYN 282 

I : | | | : : : I I I : I I I I : I I I I I : MM:: | : : : | | | | : | : I I : I I I 

Db 236 AGS---GTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQNLNLDCREYN 292

Qy 283 YDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFP 342 

I I : I I I I I II II I I : I I I : I I : :: I : I I I I I I I I I I III II 

Db 293 ADKAIVDSGTTLLRLPQKVFDAVVEAVARTSLIPEFSDGFWTGAQLACWTNSETPWAYFP 352

Qy 343 VISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGF 402

I I : I I I : : I I I I I I I I I I : : I : : : I I : I I I I : I : I I : I I I I 

Db 353 KISIYLRDENASRSFRITILPQLYIQPMMGAGFNY-ECYRFGISSSTNALVIGATVMEGF 411

Qy 403 YVVFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVM 462

I I I I I II :: I : I I I I I I : : : I I I I I : : | I : : | : 

Db 412 YVVFDRAQRRVGFAVSPCAEIEGTTVSEISGPFSTEDIASNCVPAQALNEPILWIVSYAL 471

Qy 463 AAICALFMLPLCLMV 477

: : I : I I I : : 
Db 472 MSVCGAILLVLILLL 486


RESULT 8 
Q9H2V8 

ID Q9H2V8 PRELIMINARY; PRT; 439 AA.

AC Q9H2V8; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update) 

DE CDA13. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Pheochromocytoma; 

RA Li Y., Huang Q., Peng Y., Song H., Yu Y., Xu S., Ren S., Chen Z.,

RA Han Z.;

RL Submitted (DEC-1999) to the EMBL/GenBank/DDBJ databases.

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

DR EMBL; AF212252; AAG41783.1; 

DR HSSP; P00797; 2REN.

DR InterPro; IPR001461; AspproteaseA1.

DR InterPro; IPR001969; Aspprotease_site.

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

KW Aspartyl protease; Hydrolase; Protease. 

SQ SEQUENCE 439 AA; 48275 MW; 02EC0E0E50F11602 CRC64; 

Query Match 43.4%; Score 1155.5; DB 4; Length 439; 

Best Local Similarity 49.9%; Pred. No. 5.4e-87; 

Matches 219; Conservative 78; Mismatches 135; Indels 7; Gaps 4; 

Qy 63 MVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYR 122

M I I I : I I I : I I I : I I : I : I I I I I I I I I I I II I I I II I : : I I I I I 

Db 1 MVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGTPHSYIDTYFDTERSSTYR 60 

Qy 123 DLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGL 182 

I I I I I I I I : I I I I : I I I I : I I I I II : I I : I I I I I I I 
Db 61 SKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFESENFFLPGIKWNGILGL 120 

Qy 183 AYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDH 242 

I I I : I : I I I I I I I I II I :: I I : I I I I I I I : I : I I I : : : I I I : 

Db 121 AYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVAGS---GTNGGSLVLGGIEP 177 

Qy 243 SLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVF 302 

Ml I : I I I II : INI:: I : : : I I II I : I I : I I I I I : I I I I I M I I I I : I I I 
Db 178 SLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVF 237 

Qy 303 EAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITIL 362 

: I I : : : II : I I I I I I I I I I I I I : I I I I : I I I : : : I I I I I I I 

Db 238 DAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSRSFRITIL 297 

Qy 363 PQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHV 422 

I I I : : I : : : I I : I I I I : I : I I : I I I I I I : I I II : I I : I I I I I 

Db 298 PQLYIQPMMGAGLNY-ECYRFGISPSTNALVIGATVMEGFYVIFDRAQKRVGFAASPCAE 356 

Qy 423 HDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAIC-ALFMLPLCLMVCQWR 481 

: : I I I I I : I I : : | : : : | I : : : : I : : : I 

Db 357 IAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCGAILLVLIVLLLLPFR 416 

Qy 482 CLRCLRQQHDDFADDISLL 500 

II I : : : I I I 

Db 417 CQR--RPRDPEVVNDESSL 433 


RESULT 9 
Q9JL18 

ID Q9JL18 PRELIMINARY; PRT; 514 AA. 

AC Q9JL18; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update) 

DE Aspartyl protease 1. 

GN BACE2. 

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Choi D.K., Sugano S., Sakaki Y.; 

RT "Molecular characterization of the mouse Asp1 gene, a homolog of the 

RT human ASP1 (Down Syndrome Region aspartyl protease)."; 

RL Submitted (DEC-1999) to the EMBL/GenBank/DDBJ databases. 

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1. 

DR EMBL; AF216310; AAF36599.1; -. 

DR HSSP; P00797; 2REN. 

DR MEROPS; A01.041; -. 

DR MGD; MGI:1860440; Bace2. 

DR InterPro; IPR001461; AspproteaseA1. 

DR InterPro; IPR001969; Aspprotease_site. 

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

KW Aspartyl protease; Hydrolase; Protease. 

SQ SEQUENCE 514 AA; 55799 MW; A70725F2C1DF5B47 CRC64; 


Query Match 43.2%; Score 1150; DB 11; Length 514; 

Best Local Similarity 48.3%; Pred. No. 2e-86; 

Matches 224; Conservative 76; Mismatches 144; Indels 20; Gaps 5; 

Qy 14 AGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRGKSGQ 73 

I : I II II II II II : I : I I I I I : I II : 

Db 43 ASAVPGLGTP ELPRADGLA LALEPVRAT ANFLAMVDNLQGDSGR 86 

Qy 74 GYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVYVPYT 133 

I I I : I I : I : I I I : I I III I I I I I I I III:: I : : I II I I I I I 

Db 87 GYYLEMLIGTPPQKVQILVDTGSSNFAVAGAPHSYIDTYFDSESSSTYHSKGFDVTVKYT 146 

Qy 134 QGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARPDDS 193 

I I I I : I I I I : I I I I : I I I I I I : I I : I I I I I I I I I I : I : I I 

Db 147 QGSWTGFVGEDLVTIPKGFNSSFLVNIATIFESENFFLPGIKWNGILGIAYAALAKPSSS 206 

Qy 194 LEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTP 253 

I I I I I I II I : I : : I I : : I I I I I : I : I II : : : I I I : I I I I : I I I I 

Db 207 LETFFDSLVAQAKIPDIFSMQMCGAGLPVAGS---GTNGGSLVLGGIEPSLYKGDIWYTP 263 

Qy 254 IRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAAS 313 

I : MM:: I : : : I I MM M I M II I I : I I I I II I I I II : II I : I I : : : I 

Db 264 IKEEWYYQIEILKLEIGGQNLNLDCREYNADKAIVDSGTTLLRLPQKVFDAVVEAVARTS 323 

Qy 314 STEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDV 373 

: I I I II I II I I III II II : II I : M II II II I I MM: 

Db 324 LIPEFSDGFWTGAQLACWTNSETPWAYFPKISIYLRDENASRSFRITILPQLYIQPMMGA 383 

Qy 374 ATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTAAVEG 433 

: M I : I I I I : I : II : I II I I II II I I : : I M I II I I : : : I 

Db 384 GFNY-ECYRFGISSSTNALVIGATVMEGFYVVFDRAQRRVGFAVSPCAEIEGTTVSEISG 442 

Qy 434 PFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMV 477 

M I M M I :M : :M : I I I : : 

Db 443 PFSTEDIASNCVPAQALNEPILWIVSYALMSVCGAILLVLILLL 486 


RESULT 10 
Q8C793 


ID Q8C793 PRELIMINARY; PRT; 514 AA. 

AC Q8C793; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update) 

DE Beta-site APP-cleaving enzyme 2. 

OS Mus musculus (Mouse) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Heart; 

RX MEDLINE=22354683; PubMed=12466851; 

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs."; 

RL Nature 420:563-573(2002). 

DR EMBL; AK052309; BAC34931.1; -. 

SQ SEQUENCE 514 AA; 55871 MW; 8BF45E07B0990225 CRC64; 

Query Match 43.2%; Score 1150; DB 11; Length 514; 

Best Local Similarity 47.3%; Pred. No. 2e-86; 

Matches 233; Conservative 77; Mismatches 153; Indels 30; Gaps 8; 

Qy 2 AQALPWLLLWMGAGV LP AHGTQHGIRLPLRSGLGGAPL GLRLPR 45 

I I I I I :: I II I I I I III I III 

Db 7 ALLLPVLAQWLLSAVPALAPAPFTLPLQVARATNH--RASAVPGLGTPELPRADGLALAL 64 

Qy 46 ETDEEPEEPGR-RGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAA 104 

III : I : I I I ! I : I I I : I I I : I I : I : I I I : I I I I I I I I I I I I I 

Db 65 EPVRATANFLAMVDNLQGDSGRGYYLEMLIGTPPQKVQILVDTGSSNFAVAGA 117 

Qy 105 PHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAIT 164 

II:: I : : I I I I I I I I I I I I : I I I I : I I I I = I I I I 

Db 118 PHSYIDTYFDSESSSTYHSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNSSFLVNIATIF 177 

Qy 165 ESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQ 224 

I I : I I : I I I I I I I I I I : I : I I I I II I I I I I : I : : I I : : I I I I I : 
Db 178 ESENFFLPGIKWNGILGLAYAALAKPSSSLETFFDSLVAQAKIPDIFSMQMCGAGLPVAG 237 

Qy 225 SEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYD 284 

I : I I I : : : I I I : III I : I I I I I : MM:: | : : : | | MM : I I : I I I I 

Db 238 S GTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQNLNLDCREYNAD 294 


Qy 285 KSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVI 344 

I : I I I I I I I I I I I : I I I : I I : : : I : I I I I I I I I I I III III 

Db 295 KAIVDSGTTLLRLPQKVFDAVVEAVARTSLIPEFSDGFWTGAQLACWTNSETPWAYFPKI 354 

Qy 345 SLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYV 404 

MM I : : I I I I I I I I I : : I : : : I I : I I I I : I : M : I I I I I I 

Db 355 SIYLRDENASRSFRITILPQLYIQPMMGAGFNY-ECYRFGISSSTNALVIGATVMEGFYV 413 


Qy 405 VFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAA 464 


II I I I • • I • I M M I • 111 I I " * I I 

Db 414 VFDRAQRRVGFAVSPCAEIEGTTVSEISGPFSTEDIASNCVPAQALNEPILWIVSYALMS 473 

Qy 465 ICALFMLPLCLMV 477 

: I : I I I : : 
Db 474 VCGAILLVLILLL 486 


RESULT 11 
Q8N2D4 

ID Q8N2D4 PRELIMINARY; PRT; 423 AA. 

AC Q8N2D4; 

DT 01-OCT-2002 (TrEMBLrel. 22, Created) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update) 

DE Hypothetical protein OVARC1000363. 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Ovarian carcinoma; 

RA Ota T., Nishikawa T., Suzuki Y., Kawai-Hio Y., Hayashi K., Ishii S., 

RA Saito K., Yamamoto J., Wakamatsu A., Nagai T., Nakamura Y., 

RA Nagahari K., Sugano S., Isogai T.; 

RT "HRI human cDNA sequencing project."; 

RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AK075539; BAC11682.1; 

DR InterPro; IPR001461; AspproteaseA1. 

DR InterPro; IPR001969; Aspprotease_site. 

DR Pfam; PF00026; asp; 2. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

KW Hypothetical protein. 

SQ SEQUENCE 423 AA; 46457 MW; 4D4839F2ED9C2CE1 CRC64; 

Query Match 40.3%; Score 1072.5; DB 4; Length 423; 

Best Local Similarity 48.7%; Pred. No. 3.8e-80; 

Matches 206; Conservative 74; Mismatches 136; Indels 7; Gaps 4; 

Qy 79 MTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWE 138 

I : I : I I I I I I I I I I I I I I I I I I : : I : : I II I I I I I I I I I 

Db 1 MLIGTPPQKLQILVDTGSSNFAVAGTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWT 60 

Qy 139 GELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFF 198 

I : I II I : I I I I : I I I I I I I I : I I I I I I I I II : I : I I I I I I 

Db 61 GFVGEDLVTIPKGFNTSFLVNIATIFESGNFFLPGIQWNGILGLAYATLAKPSSSLETFF 120 

Qy 199 DSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREW 258 

I I I I I : : I I : I I : : I I I I : I : II I : : : I I I : I I I I : I I I I I : I I 

Db 121 DSLVTQANIPNVFSMQMRGAGLPVAGS---GTNGGSLVLGGIEPSLYKGDIWYTPIKEEW 177 

Qy 259 YYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKF 318 

II:: I : : : I I II I : I I : I I I I I : I I I I I I I I I I I : I I I : I I : : : II : I 
Db 178 YYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAVVEAVARASLIPEF 237 


Qy 319 PDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQD 378 

I I I I I I I I I ill- II I I : I I | : : : M | | | | | | I I : : I : : 

Db 238 SDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSRSFRITILPQLYIQPMMGAGLNY- 296 

Qy 379 DCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTL 438 

: I I : I I I I : I : I I : I I I I I I : I I I I : I I : I I I I I : : I I I I 

Db 297 ECYRFGISPSTNALVIGATVMEGFYVIFDRAQKRVGFAASPCAEIAGAAVSEISGPFSTE 356 

Qy 439 DMEDCGYNIPQTDESTLMTIAYVMAAIC-ALFMLPLCLMVCQWRCLRCLRQQHDDFADDI 497 

I : I I : : I : : : I I : : : : I : : : I I I I : : : I 

Db 357 DVASNCVPAQSLSEPILWIVSYALMSVCGAILLVLIVLLLLPFRCQR--RPRDPEVVNDE 414 

Qy 498 SLL 500 

I I 

Db 415 SSL 417 


RESULT 12 
Q9NZL2 

ID Q9NZL2 PRELIMINARY; PRT; 468 AA. 

AC Q9NZL2; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update) 

DE Aspartyl protease. 

GN BACE2. 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20422477; PubMed=10965118; 

RA Solans A., Estivill X., de La Luna S.; 

RT "A new aspartyl protease on 21q22.3, BACE2, is highly similar to 

RT Alzheimer's amyloid precursor protein beta-secretase."; 

RL Cytogenet. Cell Genet. 89:177-184(2000). 

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1. 

DR EMBL; AF188276; AAF35835.1; 

DR HSSP; P00797; 2REN. 

DR InterPro; IPR001461; AspproteaseA1. 

DR InterPro; IPR001969; Aspprotease_site. 

DR Pfam; PF00026; asp; 1. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

KW Aspartyl protease; Hydrolase; Protease. 

SQ SEQUENCE 468 AA; 50324 MW; 717E0920126A0142 CRC64; 

Query Match 36.6%; Score 974.5; DB 4; Length 468; 

Best Local Similarity 40.5%; Pred. No. 5.6e-72; 

Matches 210; Conservative 76; Mismatches 150; Indels 83; Gaps 10; 

Qy 2 AQALPWLLLWM GAGVLPAHGTQHGIRLPLRSGLG GAPL GLR 42 

I I I I I : : I I I I I I I II M 

Db 7 ALLLPLLAQWLLRAAPELAPAPFT LPLRVAAATNRVVAPTPGPGTPAERHADGLA 61 

Qy 43 LPRETDEEPEEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVG 102 

I I I : I : I I I II : I I I : I II : I I : I : I I I I I I I I I I I I I I I I 


Db 62 LALE--PALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVA 119 


Qy 103 AAPHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAA 162 

II:: I : I I I I f I I I I I I I I : I I I I : I I I I : I I I 

Db 120 GTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIAT 179 

Qy 163 ITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPL 222 

I II: II: I I I I I I I I I I : I : I III I I I I I I I : : I I : I I ■ : : I I I I I : 
Db 180 IFESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV 239 

Qy 223 NQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYN 282 

I : I I I : : : I I I : III I : I I I I I : MM:: | : : : | | II I : I I : I I I 

Db 240 AGS GTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYN 296 

Qy 283 YDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFP 342 

I I : I I I I I I I I I I I : I I I : I I : : : II 
Db 297 ADKAIVDSGTTLLRLPQKVFDAVVEAVARASLL 329 

Qy 343 VISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGF 402 

I : : I : : : I I : I I I I : I : I I : I I I I 

Db 330 YIQPMMGAGLNY-ECYRFGISPSTNALVIGATVMEGF 365 

Qy 403 YVVFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVM 462 

I I : I II I : I I : I M I I : : I I I I I : I I : : I : 

Db 366 YVIFDRAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYAL 425 

Qy 463 AAIC-ALFMLPLCLMVCQWRCLRCLRQQHDDFADDISLL 500 

: : I I : : : : I : : : I I I I : : Mil 
Db 426 MSVCGAILLVLIVLLLLPFRCQR--RPRDPEVVNDESSL 462 


RESULT 13 
Q9NZL1 

ID Q9NZL1 PRELIMINARY; PRT; 396 AA. 

AC Q9NZL1; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update) 

DE Aspartyl protease. 

GN BACE2. 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20422477; PubMed=10965118; 

RA Solans A., Estivill X., de La Luna S.; 

RT "A new aspartyl protease on 21q22.3, BACE2, is highly similar to 

RT Alzheimer's amyloid precursor protein beta-secretase."; 

RL Cytogenet. Cell Genet. 89:177-184(2000). 

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1. 

DR EMBL; AF188277; AAF35836.1; -. 

DR HSSP; P00797; 2REN. 

DR InterPro; IPR001461; AspproteaseA1. 

DR InterPro; IPR001969; Aspprotease_site. 

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

KW Aspartyl protease; Hydrolase; Protease. 

SQ SEQUENCE 396 AA; 43013 MW; 5023A7AF391CEAC9 CRC64; 

Query Match 36.4%; Score 969.5; DB 4; Length 396; 

Best Local Similarity 49.3%; Pred. No. 1.1e-71; 

Matches 200; Conservative 56; Mismatches 111; Indels 39; Gaps 9; 

Qy 2 AQALPWLLLWM GAGVLPAHGTQHGIRLPLRSGLG GAPL GLR 42 

I I I I I : : I I I I I I I II II 

Db 7 ALLLPLLAQWLLRAAPELAPAPFT LPLRVAAATNRVVAPTPGPGTPAERHADGLA 61 

Qy 43 LPRETDEEPEEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVG 102 

I I I : I : I I I I I : I I I : I II : I I : I : I I I I I I II I I I I I I I I 

Db 62 LALE--PALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVA 119 

Qy 103 AAPHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAA 162 

II:: I : : I I I I I I I I I I I I I : I I I I : I I I I : III 

Db 120 GTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIAT 179 

Qy 163 ITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPL 222 

I II: II: I I I I I I I I I I : I : I III I I I I I I I : : I I : I I : : I I I I I *• 

Db 180 IFESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV 239 

Qy 223 NQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYN 282 

I : I I I : : : I I I : III I : I I I I I : MM:: | : : : | | M I : I I : I II 

Db 240 AGS GTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYN 296 

Qy 283 YDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFP 342 

I I : I I I I I I I I I I I : I I I : I I :: : II : I I I I I I I I I I I I I : I I 

Db 297 ADKAIVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFP 356 

Qy 343 VISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKF-AISQ 387 

I I : I I I : : : I I I I I I I I I : II : : III : I I 

Db 357 KISIYLRDENSSRSFRITILPQK-LRVLQ CLKFPGLSQ 393 


RESULT 14 
Q9P0D2 

ID Q9P0D2 PRELIMINARY; PRT; 213 AA. 

AC Q9P0D2; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update) 

DT 01-JUN-2002 (TrEMBLrel. 21, Last annotation update) 

DE HSPC104 (Fragment) . 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Umbilical cord blood; 

RA Zhang Q.H., Ye M., Zhou J., Shen Y., Wu X.Y., Guan Z.Q., Wang L., 

RA Fan H.Y., Mao Y.F., Dai M., Huang Q.H., Chen S.J., Chen Z.; 

RT "Human partial CDS cloned from cd34+ stem cells."; 

RL Submitted (MAY-1999) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AF161367; AAF28927.1; -. 

DR InterPro; IPR001461; AspproteaseA1. 

DR Pfam; PF00026; asp; 1. 

FT NON_TER 1 1 

SQ SEQUENCE 213 AA; 24338 MW; EC9D3FA31CFA835C CRC64; 


Query Match 26.7%; Score 712.5; DB 4; Length 213; 

Best Local Similarity 83.5%; Pred. No. 7.9e-51; 

Matches 137; Conservative 4; Mismatches 12; Indels 11; Gaps 1; 

Qy 238 GGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRL 297 
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 GGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRL 60 

Qy 298 PKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSF 357 
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 PKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSF 120 

Qy 358 RITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEG 401 
       ||||||||||||           :|      :| ||    :: |
Db 121 RITILPQQYLRP-----------WKMWPRPKTTVTVCHLTVIHG 153 


RESULT 15 
Q9R1P7 

ID Q9R1P7 PRELIMINARY; PRT; 255 AA. 

AC Q9R1P7; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update) 

DE Aspartyl protease (Fragment) . 

GN BACE2. 

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Accarino M., Fumagalli P., Taramelli R., Ottolenghi S.; 

RT "Cloning of a gene from chromosome 21 Down Region encoding a potential 

RT transmembrane protease."; 

RL Submitted (FEB-1998) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AF051150; AAD45964.1; -. 

DR MEROPS; A01.041; -. 

DR MGD; MGI:1860440; Bace2. 

DR InterPro; IPR001969; Aspprotease_site. 

DR PROSITE; PS00141; ASP_PROTEASE; 1. 

KW Protease. 

FT NON_TER 1 1 

SQ SEQUENCE 255 AA; 28685 MW; 53DE317815996D63 CRC64; 

Query Match 22.4%; Score 596.5; DB 11; Length 255; 

Best Local Similarity 47.8%; Pred. No. 4e-41; 

Matches 109; Conservative 44; Mismatches 74; Indels 1; Gaps 1; 


Qy 250 WYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSI 309 
Mill: MM:: | : : : | | I I : I : || : II I I I : II I M II II I I : I II : I I : : : 

Db 1 WYTPIKEEWYYQIEILKLEIGGQNLNLDCREYNADKAIVDSGTTLLRLPQKVFDAVVEAV 60 


Qy 310 KAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRP 369 

I : I I I I I I I I I I III 1111:11 I : : I I I I I I I I I I : : I 

Db 61 ARTSLIPEFSDGFWTGAQLACWTNSETPWAYFPKISIYLRDENASRSFRITILPQLYIQP 120 

Qy 370 VEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTA 429 

: : : I I : | | | | : I : I I : I I I I I I I I I I I :: I : I I I I I I : : 

Db 121 MMGAGFNY-ECYRFGISSSTNALVIGATVMEGFYVVFDRAQRRVGFAVSPCAEIEGTTVS 179 

Qy 430 AVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMV 477 

: I I I I I : : I I : : I : : : I : I I I : : 

Db 180 EISGPFSTEDIASNCVPAQALNEPILWIVSYALMSVCGAILLVLILLL 227 


Search completed: January 21, 2004, 09:25:07 
Job time : 106.457 secs 


GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 


OM protein - protein search, using sw model 

Run on:         January 21, 2004, 09:15:44 ; Search time 24.9063 Seconds 
                                             (without alignments) 
                                             945.960 Million cell updates/sec 

Title:          US-09-869-414A-4 
Perfect score:  2664 
Sequence:       1 MAQALPWLLLWMGAGVLPAH..........CLRCLRQQHDDFADDISLLK 501 

Scoring table:  BLOSUM62 
                Gapop 10.0 , Gapext 0.5 

Searched:       127863 seqs, 47026705 residues 

Total number of hits satisfying chosen parameters:      127863 

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
                 Maximum Match 100% 
                 Listing first 45 summaries 

Database :      SwissProt_41:* 

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
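The report does not describe how GenCore derives Pred. No. from the score distribution. As an illustration only, the sketch below uses a generic extreme-value approach: fit a Gumbel distribution to a sample of scores by the method of moments, then multiply the tail probability at a given score by the number of sequences searched. The function names, the choice of a Gumbel model, and the sample scores are all assumptions, not the vendor's method.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def fit_gumbel(scores):
    """Method-of-moments estimate of Gumbel location (mu) and scale (beta)."""
    mean = statistics.fmean(scores)
    stdev = statistics.pstdev(scores)
    beta = stdev * math.sqrt(6.0) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def predicted_number(score, mu, beta, n_sequences):
    """Expected count of chance hits scoring >= score among n_sequences."""
    # P(S >= score) = 1 - exp(-exp(-(score - mu) / beta)); expm1 keeps
    # precision when the tail probability is very small.
    tail = -math.expm1(-math.exp(-(score - mu) / beta))
    return n_sequences * tail


# Hypothetical background scores, for illustration only.
background = [20, 25, 30, 35, 40, 45, 50, 28, 33, 31]
mu, beta = fit_gumbel(background)
for s in (40, 60, 100):
    print(s, predicted_number(s, mu, beta, n_sequences=127863))
```

The predicted count falls roughly exponentially as the score rises, which is why the top hits in this report carry values such as 2.7e-206.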


SUMMARIES 


Result            Query 
   No.    Score   Match  Length  DB  ID           Description 

     1     2656    99.7     501   1  BACE_HUMAN   P56817 homo sapien 
     2     2569    96.4     501   1  BACE_RAT     P56819 rattus norv 
     3     2567    96.4     501   1  BACE_MOUSE   P56818 mus musculu 
     4   1173.5    44.1     518   1  BAE2_HUMAN   Q9y5z0 homo sapien 
     5      330    12.4     324   1  PEP1_GADMO   P56272 gadus morhu 
     6    314.5    11.8     390   1  CATD_BOVIN   P80209 bos taurus 
     7      309    11.6     387   1  PEP1_RABIT   P28712 oryctolagus 
     8    307.5    11.5     388   1  PEP4_MACFU   P27678 macaca fusc 
     9      305    11.4     367   1  PEPA_CHICK   P00793 gallus gall 
    10    301.5    11.3     383   1  PEPE_CHICK   P16476 gallus gall 
    11    301.5    11.3     396   1  CATE_HUMAN   P14091 homo sapien 
    12    300.5    11.3     412   1  CATD_HUMAN   P07339 homo sapien 
    13      299    11.2     387   1  PEP2_RABIT   P27821 oryctolagus 
    14      298    11.2     387   1  PEP4_RABIT   P28713 oryctolagus 
    15      297    11.1     407   1  CATD_RAT     P24268 rattus norv 
    16      295    11.1     391   1  CATE_CAVPO   P25796 cavia porce 
    17    294.5    11.1     388   1  PEP2_MACFU   P27677 macaca fusc 
    18      289    10.8     387   1  PEP3_RABIT   P27822 oryctolagus 
    19    288.5    10.8     388   1  PEPA_HUMAN   P00790 homo sapien 
    20    288.5    10.8     388   1  PEPA_MACMU   P11489 macaca mula 
    21    288.5    10.8     398   1  CATE_RAT     P16228 rattus norv 
    22      287    10.8     410   1  CATD_MOUSE   P18242 mus musculu 
    23    286.5    10.8     388   1  PEP1_MACFU   P03954 macaca fusc 
    24      286    10.7     398   1  CATD_CHICK   Q05744 gallus gall 
    25    284.5    10.7     381   1  CHYM_SHEEP   P18276 ovis aries 
    26    281.5    10.6     386   1  PEPA_PIG     P00791 sus scrofa 
    27      281    10.5     387   1  PEPA_CALJA   Q9n2d4 callithrix 
    28    280.5    10.5     396   1  CATD_CLUHA   Q9dex3 clupea hare 
    29    280.5    10.5     397   1  CATE_MOUSE   P70269 mus musculu 
    30    276.5    10.4     381   1  CHYM_BOVIN   P00794 bos taurus 
    31    276.5    10.4     396   1  CATE_RABIT   P43159 oryctolagus 
    32    274.5    10.3     419   1  CARV_CANAL   P10977 candida alb 
    33    273.5    10.3     376   1  PAG2_BOVIN   Q28057 bos taurus 
    34    273.5    10.3     377   1  PEPC_MACFU   P03955 macaca fusc 
    35      273    10.2     388   1  PEPF_RABIT   P27823 oryctolagus 
    36    270.5    10.2     381   1  CHYM_CALJA   Q9n2d2 callithrix 
    37      268    10.1     396   1  CARP_NEUCR   Q01294 neurospora 
    38      267    10.0     365   1  CATD_SHEEP   Q9mzs8 ovis aries 
    39    266.5    10.0     388   1  PEPC_CALJA   Q9n2d3 callithrix 
    40      266    10.0     394   1  PEPC_CAVPO   Q64411 cavia porce 
    41      266    10.0     405   1  CARP_YEAST   P07267 saccharomyc 
    42    264.5     9.9     388   1  PEPC_HUMAN   P20142 homo sapien 
    43      264     9.9     388   1  PAG_HORSE    Q28389 equus cabal 
    44      262     9.8     496   1  ASPR_ORYSA   P42211 oryza sativ 
    45    261.5     9.8     387   1  ASPP_AEDAE   Q03168 aedes aegyp 
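Each summary row carries the fields No., Score, Query Match, Length, DB, ID and Description. The sketch below parses one row written as a single whitespace-separated line and recomputes the Query Match percentage as score divided by the perfect score of 2664. That formula is an assumption; it is not stated in the report, but it reproduces the printed percentages to within rounding. The class and function names are hypothetical.

```python
from dataclasses import dataclass

PERFECT_SCORE = 2664  # "Perfect score" from the run header


@dataclass
class SummaryRow:
    number: int
    score: float
    query_match: float  # percent, as printed in the report
    length: int
    db: int
    entry_id: str
    description: str


def parse_summary_row(line: str) -> SummaryRow:
    """Split one summary line into its seven fields.

    The description may itself contain spaces, so only the first six
    separators are used for splitting.
    """
    fields = line.split(maxsplit=6)
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    return SummaryRow(int(fields[0]), float(fields[1]), float(fields[2]),
                      int(fields[3]), int(fields[4]), fields[5], fields[6])


def recomputed_query_match(row: SummaryRow,
                           perfect_score: float = PERFECT_SCORE) -> float:
    """Query Match in percent, assuming it is score / perfect score."""
    return 100.0 * row.score / perfect_score


row = parse_summary_row("4 1173.5 44.1 518 1 BAE2_HUMAN Q9y5z0 homo sapien")
print(row.entry_id, f"{recomputed_query_match(row):.1f}%")
```

A check of this kind is handy when the table has been recovered from a scan, since a mis-read score or percentage shows up as a disagreement between the two columns.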



ALIGNMENTS 


RESULT 1 
BACE_HUMAN 

ID BACE_HUMAN STANDARD; PRT; 501 AA. 

AC P56817; Q9BYB9; Q9BYC0; Q9BYC1; Q9UJT5; 

DT 30-MAY-2000 (Rel. 39, Created) 

DT 30-MAY-2000 (Rel. 39, Last sequence update) 

DT 15-SEP-2003 (Rel. 42, Last annotation update) 

DE Beta-secretase precursor (EC 3.4.23.-) (Beta-site APP cleaving enzyme) 

DE (Beta-site amyloid precursor protein cleaving enzyme) (Aspartyl 

DE protease 2) (Asp 2) (ASP2) (Membrane-associated aspartic protease 2) 

DE (Memapsin-2). 

GN BACE OR BACE1. 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORM A). 

RC TISSUE=Brain; 

RX MEDLINE=20002972; PubMed=10531052; 

RA Vassar R., Bennett B.D., Babu-Khan S., Kahn S., Mendiaz E.A., 

RA Denis P., Teplow D.B., Ross S., Amarante P., Loeloff R., Luo Y., 

RA Fisher S., Fuller J., Edenson S., Lile J., Jarosinski M.A., 

RA Biere A.L., Curran E., Burgess T., Louis J.-C., Collins F., 



RA Treanor J., Rogers G., Citron M.; 

RT "Beta-secretase cleavage of Alzheimer's amyloid precursor protein by 

RT the transmembrane aspartic protease BACE."; 

RL Science 286:735-741(1999). 

RN [2] 

RP SEQUENCE FROM N.A. (ISOFORM A), SEQUENCE OF 46-68, AND 

RP CHARACTERIZATION. 

RC TISSUE=Brain; 

RX MEDLINE=20057171; PubMed=10591214; 

RA Sinha S., Anderson J.P., Barbour R., Basi G.S., Caccavello R., 

RA Davis D., Doan M., Dovey H.F., Frigon N., Hong J., Jacobson-Croak K., 

RA Jewett N., Keim P., Knops J., Lieberburg I., Power M., Tan H., 

RA Tatsuno G., Tung J., Schenk D., Seubert P., Suomensaari S.M., Wang S., 

RA Walker D., Zhao J., McConlogue L., Varghese J.; 

RT "Purification and cloning of amyloid precursor protein beta-secretase 

RT from human brain."; 

RL Nature 402:537-540(1999). 

RN [3] 

RP SEQUENCE FROM N.A. (ISOFORM A). 

RX MEDLINE=20057170; PubMed=10591213; 

RA Yan R., Bienkowski M.J., Shuck M.E., Miao H., Tory M.C., Pauley A.M., 

RA Brashier J.R., Stratman N.C., Mathews W.R., Buhl A.E., Carter D.B., 

RA Tomasselli A.G., Parodi L.A., Heinrikson R.L., Gurney M.E.; 

RT "Membrane-anchored aspartyl protease with Alzheimer's disease beta- 

RT secretase activity."; 

RL Nature 402:533-537(1999). 

RN [4] 

RP SEQUENCE FROM N.A. (ISOFORM A). 

RX MEDLINE=20120043; PubMed=10656250; 

RA Hussain I., Powell D.J., Howlett D.R., Tew D.G., Meek T.D., 

RA Chapman C., Gloger I.S., Murphy K.E., Southan C.D., Ryan D.M., 

RA Smith T.S., Simmons D.L., Walsh F.S., Dingwall C., Christie G.; 

RT "Identification of a novel aspartic proteinase (Asp 2) as beta- 

RT secretase."; 

RL Mol. Cell. Neurosci. 14:419-427(1999). 

RN [5] 

RP SEQUENCE FROM N.A. (ISOFORM B). 

RC TISSUE=Brain, and Pancreas; 

RA Michel B., De Pietri Tonelli D., Zacchetti D., Keller P.; 

RT "New beta-site APP cleaving enzyme isoform (BACE-1B) obtained from 

RT human brain and pancreas."; 

RL Submitted (JAN-2001) to the EMBL/GenBank/DDBJ databases. 

RN [6] 

RP SEQUENCE FROM N.A. (ISOFORM C). 

RC TISSUE=Pancreas; 

RA Zacchetti D., De Pietri Tonelli D., Schnurbus R.; 

RT "New beta-site APP cleaving enzyme isoform (BACE-1C) obtained from 

RT human pancreas."; 

RL Submitted (JAN-2001) to the EMBL/GenBank/DDBJ databases. 

RN [7] 

RP SEQUENCE FROM N.A. (ISOFORMS B, C AND D). 

RC TISSUE=Brain; 

RX MEDLINE=21408467; PubMed=11516562; 

RA Tanahashi H., Tabira T.; 

RT "Three novel alternatively spliced isoforms of the human beta-site 

RT amyloid precursor protein cleaving enzyme (BACE) and their effect on 

RT amyloid beta-peptide production."; 


RL Neurosci. Lett. 307:9-12(2001). 

RN [8] 

RP SEQUENCE OF 14-501 FROM N.A. (ISOFORM A), AND CHARACTERIZATION. 

RX MEDLINE=20144060; PubMed=10677483; 

RA Lin X., Koelsch G., Wu S., Downs D., Dashti A., Tang J.; 

RT "Human aspartic protease memapsin 2 cleaves the beta-secretase site of 

RT beta-amyloid precursor protein."; 

RL Proc. Natl. Acad. Sci. U.S.A. 97:1456-1460(2000). 

RN [9] 

RP DISULFIDE BONDS. 

RX MEDLINE=21950860; PubMed=11953458; 

RA Fischer F., Molinari M., Bodendorf U., Paganetti P.; 

RT "The disulphide bonds in the catalytic domain of BACE are critical but 

RT not essential for amyloid precursor protein processing activity."; 

RL J. Neurochem. 80:1079-1088(2002). 

CC -!- FUNCTION: RESPONSIBLE FOR THE PROTEOLYTIC PROCESSING OF THE 
CC AMYLOID PRECURSOR PROTEIN (APP). CLEAVES AT THE AMINO TERMINUS OF 

CC THE A-BETA PEPTIDE SEQUENCE, BETWEEN RESIDUES 671 AND 672 OF APP, 

CC LEADS TO THE GENERATION AND EXTRACELLULAR RELEASE OF BETA-CLEAVED 

CC SOLUBLE APP, AND A CORRESPONDING CELL-ASSOCIATED CARBOXY-TERMINAL 

CC FRAGMENT WHICH IS LATER RELEASED BY GAMMA-SECRETASE. 

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. 

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=4; 

CC Name=A; Synonyms=BACE-1A, BACE-I-501; 

CC IsoId=P56817-1; Sequence=Displayed; 

CC Name=B; Synonyms=BACE-1B, BACE-I-476; 

CC IsoId=P56817-2; Sequence=VSP_005223; 

CC Name=C; Synonyms=BACE-1C, BACE-I-457; 

CC IsoId=P56817-3; Sequence=VSP_005222; 

CC Name=D; Synonyms=BACE-1D, BACE-I-432; 

CC IsoId=P56817-4; Sequence=VSP_005222, VSP_005223; 

CC -!- TISSUE SPECIFICITY: BRAIN. 

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; AF190725; AAF04142.1; 

DR EMBL; AF201468; AAF18982.1; 

DR EMBL; AF200343; AAF17079.1; -. 

DR EMBL; AF204943; AAF26367.1; -. 

DR EMBL; AF338816; AAK38374.1; -. 

DR EMBL; AF338817; AAK38375.1; -. 

DR EMBL; AB050436; BAB40931.1; -. 

DR EMBL; AB050437; BAB40932.1; -. 

DR EMBL; AB050438; BAB40933.1; 

DR EMBL; AF200193; AAF13715.1; 

DR PIR; A59090; A59090. 

DR PDB; 1M4H; 28-AUG-02. 

DR MEROPS; A01.004; -. 

DR Genew; HGNC:933; BACE. 


DR MIM; 604252; -. 

DR GO; GO:0005887; C:integral to plasma membrane; TAS. 

DR GO; GO:0008798; F:beta-aspartyl-peptidase activity; TAS. 

DR GO; GO:0009405; P:pathogenesis; TAS. 

DR GO; GO:0006508; P:proteolysis and peptidolysis; TAS. 

DR InterPro; IPR001969; Aspprotease_site. 

DR InterPro; IPR001461; AspproteaseA1. 

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 1. 

KW Hydrolase; Aspartyl protease; Glycoprotein; Zymogen; Transmembrane; 


KW Signal; Alternative splicing; 3D-structure. 

FT SIGNAL        1     21       POTENTIAL. 
FT PROPEP       22     45 
FT CHAIN        46    501       BETA-SECRETASE. 
FT DOMAIN       22    457       EXTRACELLULAR (POTENTIAL). 
FT TRANSMEM    458    478       POTENTIAL. 
FT DOMAIN      479    501       CYTOPLASMIC (POTENTIAL). 
FT ACT_SITE     93     93       BY SIMILARITY. 
FT ACT_SITE    289    289       BY SIMILARITY. 
FT DISULFID    216    420 
FT DISULFID    278    443 
FT DISULFID    330    380 
FT CARBOHYD    153    153       N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD    172    172       N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD    223    223       N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD    354    354       N-LINKED (GLCNAC...) (POTENTIAL). 
FT VARSPLIC    146    189       Missing (in isoform C and isoform D) 
FT                              /FTId=VSP_005222. 
FT VARSPLIC    190    214       Missing (in isoform B and isoform D) 
FT                              /FTId=VSP_005223. 

SQ 

SEQUENCE 

501 AA; 

55763 MW; 

377CE4C824ACEF05 CRC64; 


Query Match 99.7%; Score 2656; DB 1; Length 501;

Best Local Similarity 99.8%; Pred. No. 2.7e-206;

Matches 500; Conservative 0; Mismatches 1; Indels 0; Gaps 0;


Qy   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60

Qy  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       |||||||||||||||||||||
Db 481 RCLRCLRQQHDDFADDISLLK 501

RESULT 2 
BACE_RAT 

ID BACE_RAT STANDARD; PRT; 501 AA. 

AC P56819; 

DT 30-MAY-2000 (Rel. 39, Created) 

DT 30-MAY-2000 (Rel. 39, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Beta-secretase precursor (EC 3.4.23.-) (Beta-site APP cleaving enzyme) 

DE (Beta-site amyloid precursor protein cleaving enzyme) (Aspartyl 

DE protease 2) (Asp 2) (ASP2) (Membrane-associated aspartic protease 2) 

DE (Memapsin-2).

GN BACE.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20002972; PubMed=10531052;

RA Vassar R., Bennett B.D., Babu-Khan S., Kahn S., Mendiaz E.A.,

RA Denis P., Teplow D.B., Ross S., Amarante P., Loeloff R., Luo Y.,

RA Fisher S., Fuller J., Edenson S., Lile J., Jarosinski M.A.,

RA Biere A.L., Curran E., Burgess T., Louis J.-C., Collins F.,

RA Treanor J., Rogers G., Citron M.;

RT "Beta-secretase cleavage of Alzheimer's amyloid precursor protein by 

RT the transmembrane aspartic protease BACE."; 

RL Science 286:735-741(1999). 

CC -!- FUNCTION: RESPONSIBLE FOR THE PROTEOLYTIC PROCESSING OF THE 

CC AMYLOID PRECURSOR PROTEIN (APP). CLEAVES AT THE AMINO TERMINUS OF

CC THE A-BETA PEPTIDE SEQUENCE, BETWEEN RESIDUES 671 AND 672 OF APP,

CC LEADS TO THE GENERATION AND EXTRACELLULAR RELEASE OF BETA-CLEAVED

CC SOLUBLE APP, AND A CORRESPONDING CELL-ASSOCIATED CARBOXY-TERMINAL

CC FRAGMENT WHICH IS LATER RELEASE BY GAMMA-SECRETASE (BY

CC SIMILARITY).

CC -!- SUBCELLULAR LOCATION: Type I membrane protein.

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its


CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF190727; AAF04144.1; -. 

DR HSSP; P32329; 1YPS. 

DR MEROPS; A01.004; -. 

DR InterPro; IPR001969; Aspprotease_site.

DR InterPro; IPR001461; AspproteaseA1.

DR Pfam; PF00026; asp; 1. 

DR PROSITE; PS00141; ASP_PROTEASE; 1. 

KW Hydrolase; Aspartyl protease; Glycoprotein; Zymogen; Transmembrane;

KW Signal.




FT SIGNAL 1 21 POTENTIAL.
FT PROPEP 22 45 POTENTIAL.
FT CHAIN 46 501 BETA-SECRETASE.
FT DOMAIN 22 457 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 458 478 POTENTIAL.
FT DOMAIN 479 501 CYTOPLASMIC (POTENTIAL).
FT ACT_SITE 93 93 BY SIMILARITY.
FT ACT_SITE 289 289 BY SIMILARITY.
FT DISULFID 216 420 BY SIMILARITY.
FT DISULFID 278 443 BY SIMILARITY.
FT DISULFID 330 380 BY SIMILARITY.
FT CARBOHYD 153 153 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 172 172 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 223 223 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 354 354 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 501 AA; 55806 MW; 24B445BC8BE87DE3 CRC64;

Query Match 96.4%; Score 2569; DB 1; Length 501;

Best Local Similarity 96.2%; Pred. No. 2.7e-199;

Matches 482; Conservative 7; Mismatches 12; Indels 0;


Qy   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
       || || |||||:|:|:||| ||  |||||||||| | |||||||||||||||||||||||
Db   1 MAPALRWLLLWVGSGMLPAQGTHLGIRLPLRSGLAGPPLGLRLPRETDEEPEEPGRRGSF 60

Qy  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       |||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKSVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||:||:||| ||||||||||:| |||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTEALASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       ||||||| |||||||||||||
Db 481 RCLRCLRHQHDDFADDISLLK 501


RESULT 3 
BACE_MOUSE 

ID BACE_MOUSE STANDARD; PRT; 501 AA. 

AC P56818; 

DT 30-MAY-2000 (Rel. 39, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Beta-secretase precursor (EC 3.4.23.-) (Beta-site APP cleaving enzyme) 

DE (Beta-site amyloid precursor protein cleaving enzyme) (Aspartyl 

DE protease 2) (Asp 2) (ASP2) (Membrane-associated aspartic protease 2) 

DE (Memapsin-2).

GN BACE.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20002972; PubMed=10531052;

RA Vassar R., Bennett B.D., Babu-Khan S., Kahn S., Mendiaz E.A.,

RA Denis P., Teplow D.B., Ross S., Amarante P., Loeloff R., Luo Y.,

RA Fisher S., Fuller J., Edenson S., Lile J., Jarosinski M.A.,

RA Biere A.L., Curran E., Burgess T., Louis J.-C., Collins F.,

RA Treanor J., Rogers G., Citron M.;

RT "Beta-secretase cleavage of Alzheimer's amyloid precursor protein by 

RT the transmembrane aspartic protease BACE."; 

RL Science 286:735-741(1999).

RN [2] 

RP REVISIONS TO 6 AND 81-87. 

RA Bennett B.D., Vassar R., Citron M.;

RL Submitted (JAN-2000) to the EMBL/GenBank/DDBJ databases.

RN [3] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20057170; PubMed=10591213;

RA Yan R., Bienkowski M.J., Shuck M.E., Miao H., Tory M.C., Pauley A.M.,

RA Brashier J.R., Stratman N.C., Mathews W.R., Buhl A.E., Carter D.B., 

RA Tomasselli A.G., Parodi L.A., Heinrikson R.L., Gurney M.E.; 

RT "Membrane-anchored aspartyl protease with Alzheimer's disease 

RT beta-secretase activity."; 

RL Nature 402:533-537(1999). 

CC -!- FUNCTION: RESPONSIBLE FOR THE PROTEOLYTIC PROCESSING OF THE 

CC AMYLOID PRECURSOR PROTEIN (APP). CLEAVES AT THE AMINO TERMINUS OF

CC THE A-BETA PEPTIDE SEQUENCE, BETWEEN RESIDUES 671 AND 672 OF APP,

CC LEADS TO THE GENERATION AND EXTRACELLULAR RELEASE OF BETA-CLEAVED

CC SOLUBLE APP, AND A CORRESPONDING CELL-ASSOCIATED CARBOXY-TERMINAL

CC FRAGMENT WHICH IS LATER RELEASE BY GAMMA-SECRETASE (BY

CC SIMILARITY).

CC -!- SUBCELLULAR LOCATION: Type I membrane protein.

CC -!- TISSUE SPECIFICITY: BRAIN.

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; AF190726; AAF04143.2; -.

DR EMBL; AF200346; AAF17082.1; -.

DR HSSP; P56272; 1AM5.

DR MEROPS; A01.004; -.

DR MGD; MGI:1346542; Bace.

DR InterPro; IPR001969; Aspprotease_site.

DR InterPro; IPR001461; AspproteaseA1.

DR Pfam; PF00026; asp; 1. 

DR PROSITE; PS00141; ASP_PROTEASE; 1. 

KW Hydrolase; Aspartyl protease; Glycoprotein; Zymogen; Transmembrane;

KW Signal.




FT SIGNAL 1 21 POTENTIAL.
FT PROPEP 22 45 POTENTIAL.
FT CHAIN 46 501 BETA-SECRETASE.
FT DOMAIN 22 457 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 458 478 POTENTIAL.
FT DOMAIN 479 501 CYTOPLASMIC (POTENTIAL).
FT ACT_SITE 93 93 BY SIMILARITY.
FT ACT_SITE 289 289 BY SIMILARITY.
FT DISULFID 216 420 BY SIMILARITY.
FT DISULFID 278 443 BY SIMILARITY.
FT DISULFID 330 380 BY SIMILARITY.
FT CARBOHYD 153 153 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 172 172 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 223 223 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 354 354 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 501 AA; 55747 MW; C085A013145E474E CRC64;


Query Match 96.4%; Score 2567; DB 1; Length 501; 

Best Local Similarity 96.2%; Pred. No. 4e-199; 

Matches 482; Conservative 7; Mismatches 12; Indels 0; Gaps 0; 

Qy   1 MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSF 60
       || || |||||:|:|:||| ||  |||||||||| | ||||||||||||| |||||||||
Db   1 MAPALHWLLLWVGSGMLPAQGTHLGIRLPLRSGLAGPPLGLRLPRETDEESEEPGRRGSF 60

Qy  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 VEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSST 120

Qy 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGIL 180

Qy 181 GLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGI 240
       ||||||||||||||||||||||||||:||:||| ||||||||||:| |||||||||||||
Db 181 GLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTEALASVGGSMIIGGI 240

Qy 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKK 300

Qy 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRIT 360

Qy 361 ILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420
       ||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||||
Db 361 ILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSAC 420

Qy 421 HVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480
       ||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
Db 421 HVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQW 480

Qy 481 RCLRCLRQQHDDFADDISLLK 501
       ||||||| |||||||||||||
Db 481 RCLRCLRHQHDDFADDISLLK 501
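
The summary lines printed above each alignment appear to follow two simple ratios: "Query Match" looks like the hit score divided by the perfect score reported in the listing header (2664), and "Best Local Similarity" looks like the number of matches divided by the total number of alignment columns. This is an inference from the figures in this listing, not a documented definition from the search program, and the function names below are hypothetical. A minimal Python sketch under those assumptions:

```python
# Assumption: the percentages are plain ratios rounded to one decimal place.
PERFECT_SCORE = 2664  # "Perfect score" from the header of this listing


def query_match(score, perfect=PERFECT_SCORE):
    """Hit score as a percentage of the query's perfect (self) score."""
    return round(100.0 * score / perfect, 1)


def best_local_similarity(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


if __name__ == "__main__":
    # Figures taken from the RESULT 3 (BACE_MOUSE) summary lines.
    print(query_match(2567))
    print(best_local_similarity(482, 7, 12, 0))
```

With the RESULT 3 figures (score 2567; 482 matches, 7 conservative, 12 mismatches, 0 indels) the two functions reproduce the printed 96.4% and 96.2%.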


RESULT 4 
BAE2_HUMAN 

ID BAE2_HUMAN STANDARD; PRT; 518 AA.

AC Q9Y5Z0; Q9UJT6; 

DT 30-MAY-2000 (Rel. 39, Created) 

DT 30-MAY-2000 (Rel. 39, Last sequence update) 

DT 15-SEP-2003 (Rel. 42, Last annotation update) 

DE Beta secretase 2 precursor (EC 3.4.23.-) (Beta-site APP-cleaving 

DE enzyme 2) (Aspartyl protease 1) (Asp 1) (ASP1) (Membrane-associated 

DE aspartic protease 1) (Memapsin-1).

GN BACE2 OR ASP21.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20057170; PubMed=10591213;

RA Yan R., Bienkowski M.J., Shuck M.E., Miao H., Tory M.C., Pauley A.M.,

RA Brashier J.R., Stratman N.C., Mathews W.R., Buhl A.E., Carter D.B.,

RA Tomasselli A.G., Parodi L.A., Heinrikson R.L., Gurney M.E.;

RT "Membrane-anchored aspartyl protease with Alzheimer's disease

RT beta-secretase activity."; 

RL Nature 402:533-537(1999). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Bone marrow; 

RA Xin H., Stephans J.C., Duan X., Harrowe G., Kim E., Grieshammer U.,

RA Giese K.;

RT "Identification of a novel aspartic-like protease differentially 

RT expressed in human breast cancer cell lines."; 

RL Submitted (JAN-1999) to the EMBL/GenBank/DDBJ databases.

RN [3]

RP SEQUENCE FROM N.A.

RA Accarino M.P., Fumagalli P., Ottolenghi S., Taramelli R.;

RT "Cloning of a gene from chromosome 21 Down region encoding a potential

RT transmembrane aspartyl protease.";

RL Submitted (FEB-1998) to the EMBL/GenBank/DDBJ databases.

RN [4]

RP SEQUENCE FROM N.A.

RA Solans A., Estivill X., de la Luna S.;

RT "Cloning of a novel mammalian aspartyl protease.";

RL Submitted (AUG-1999) to the EMBL/GenBank/DDBJ databases.

RN [5] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20120043; PubMed=10656250;

RA Hussain I., Powell D.J., Howlett D.R., Tew D.G., Meek T.D.,

RA Chapman C., Gloger I.S., Murphy K.E., Southan C.D., Ryan D.M.,

RA Smith T.S., Simmons D.L., Walsh F.S., Dingwall C., Christie G.;

RT "Identification of a novel aspartic proteinase (Asp 2) as

RT beta-secretase.";

RL Mol. Cell. Neurosci. 14:419-427(1999). 

RN [6] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20144060; PubMed=10677483;

RA Lin X., Koelsch G., Wu S., Downs D., Dashti A., Tang J.;

RT "Human aspartic protease memapsin 2 cleaves the beta-secretase site of 

RT beta-amyloid precursor protein."; 

RL Proc. Natl. Acad. Sci. U.S.A. 97:1456-1460(2000). 

RN [7] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20289799; PubMed=10830953;

RA Hattori M., Fujiyama A., Taylor T.D., Watanabe H., Yada T.,

RA Park H.-S., Toyoda A., Ishii K., Totoki Y., Choi D.-K., Groner Y.,

RA Soeda E., Ohki M., Takagi T., Sakaki Y., Taudien S., Blechschmidt K.,

RA Polley A., Menzel U., Delabar J., Kumpf K., Lehmann R., Patterson D.,

RA Reichwald K., Rump A., Schillhabel M., Schudy A., Zimmermann W.,

RA Rosenthal A., Kudoh J., Shibuya K., Kawasaki K., Asakawa S.,

RA Shintani A., Sasaki T., Nagamine K., Mitsuyama S., Antonarakis S.E.,

RA Minoshima S., Shimizu N., Nordsiek G., Hornischer K., Brandt P.,

RA Scharfe M., Schoen O., Desario A., Reichelt J., Kauer G., Bloecker H.,

RA Ramser J., Beck A., Klages S., Hennig S., Riesselmann L., Dagand E.,

RA Wehrmeyer S., Borzym K., Gardiner K., Nizetic D., Francis F.,

RA Lehrach H., Reinhardt R., Yaspo M.-L.;

RT "The DNA sequence of human chromosome 21."; 

RL Nature 405:311-319(2000). 

RN [8] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Skin; 

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length 

RT human and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. 

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF200342; AAF17078.1; 

DR EMBL; AF117892; AAD45240.1; -. 

DR EMBL; AF050171; AAD45963.1; -. 

DR EMBL; AF178532; AAF29494.1; -. 

DR EMBL; AF204944; AAF26368.1; -. 

DR EMBL; AF200192; AAF13714.1; -. 

DR EMBL; AL163284; CAB90458.1; -. 

DR EMBL; AL163285; CAB90554.1; 

DR EMBL; BC014453; AAH14453.1; 

DR HSSP; P00797; 2REN.

DR MEROPS; A01.041; -.

DR Genew; HGNC:934; BACE2.

DR MIM; 605668; -.

DR GO; GO:0005624; C:membrane fraction; TAS.

DR GO; GO:0006464; P:protein modification; TAS.

DR GO; GO:0009306; P:protein secretion; TAS.

DR InterPro; IPR001969; Aspprotease_site.

DR InterPro; IPR001461; AspproteaseA1.

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

KW Hydrolase; Aspartyl protease; Glycoprotein; Zymogen; Transmembrane;

KW Signal.




FT SIGNAL 1 20 POTENTIAL.
FT PROPEP 21 POTENTIAL.
FT CHAIN 7 518 BETA SECRETASE 2.
FT DOMAIN 21 473 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 474 494 POTENTIAL.
FT DOMAIN 495 518 CYTOPLASMIC (POTENTIAL).
FT ACT_SITE 110 110 BY SIMILARITY.
FT ACT_SITE 303 303 BY SIMILARITY.
FT CARBOHYD 170 170 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 366 366 N-LINKED (GLCNAC...) (POTENTIAL).
FT CONFLICT 36 36 A -> T (IN REF. 6).

SQ SEQUENCE 518 AA; 56180 MW; 2E903150823760D3 CRC64; 

Query Match 44.1%; Score 1173.5; DB 1; Length 518; 

Best Local Similarity 46.1%; Pred. No. 7.7e-87; 

Matches 239; Conservative 82; Mismatches 165; Indels 33; Gaps 9; 

Qy 2 AQALPWLLLWM GAGVLPAHGTQHGIRLPLRSGLG GAPL GLR 42 

I I I I I : : I I I I I I I II II 

Db 7 ALLLPLLAQWLLRAAPELAPAPFT LPLRVAAATNRWAPTPGPGTPAERHADGLA 61 

Qy 43 LPRETDEEPEEPGRRGS FVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVG 102 

I I I : I : I I I I I : I I I : I I I : I I : I : I I I I I I I I I I I I I I I I 

Db 62 LALE — PALAS PAGAANFLAMVDNLQGDSGRGYYLEMLI GTPPQKLQI LVDTGS SNFAVA 119 

Qy 103 AAPHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAA 162 

II:: I : . : I I I I I I I I I I I I I : I I I I : I I I I : III 

Db 120 GTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIAT 179 

Qy 163 ITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLHLCGAGFPL 222 

I II: II: I I I I I I I I I I : I : I III I I I I I I I : : I I : I I : : I I I I I : 

Db 180 IFESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV 239 

Qy 223 NQS EVLAS VGGSMI I GGI DHS L YTGS LWYT P I RREWYYEVI I VRVEI NGQDLKMDCKE YN 282 

I : I I I : : : I I I : III I : I I I I I : MM:: I : : : II II I : I i : II I 

Db 240 AGS GTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYN 296 

Qy 283 YDKS I VDS GTTNLRLPKKVFEAAVKS I KAAS STEKFPDGFWLGEQLVCWQAGTTPWNI FP 342 

I I : I II I I II I II I : II I : I I :: : II : I M I I I II M I I I : I I 

Db 297 ADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFP 356 

Qy 343 VISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGF 402 

I I : I I | : : : I | : I I I M I I : : I : : : I I : I II I : I : II : I M I 

Db 357 KISIYLRDENSSRSFRITILPQLYIQPMMGAGLNY-ECYRFGISPSTNALVIGATVMEGF 415 

Qy 403 YWFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVM 462 

II : I I II : M : I I I I I : : I II I I : 1 I : : I : 

Db 416 YVI FD RAQKRVG FAAS P CAE I AGAAVS E I S G P F S T ED VAS N CVP AQS L S E P I LW I VS YAL 475 

Qy 463 AAIC-ALFMLPLCLMVCQWRCLRCLRQQHDDFADDISLL 500 

: : I I : : : : I : : Mil I : : MM 
Db 476 MSVCGAILLVLIVLLLLPFRCQR — RPRDPEWNDESSL 512 


RESULT 5 
PEP1_GADMO

ID PEP1_GADMO STANDARD; PRT; 324 AA.

AC P56272; 

DT 15-JUL-1998 (Rel. 36, Created) 

DT 15-JUL-1998 (Rel. 36, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Pepsin IIB (EC 3.4.23.-). 

OS Gadus morhua (Atlantic cod).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Actinopterygii; Neopterygii; Teleostei; Euteleostei; Neoteleostei;

OC Acanthomorpha; Paracanthopterygii; Gadiformes; Gadidae; Gadus.

OX NCBI_TaxID=8049;

RN [1]

RP SEQUENCE, AND X-RAY CRYSTALLOGRAPHY.

RC TISSUE=Stomach;

RA Karlsen S., Hough E., Olsen R.L.;

RT "Structure and proposed amino-acid sequence of a pepsin from Atlantic

RT cod (Gadus morhua).";

RL Acta Crystallogr. D 54:32-46(1998).

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY .

DR PDB; 1AM5; 24-DEC-97.

DR InterPro; IPR001969; Aspprotease_site.

DR InterPro; IPR001461; AspproteaseA1.

DR Pfam; PF00026; asp; 1.

DR PRINTS; PR00792; PEPSIN.

DR PROSITE; PS00141; ASP_PROTEASE; 2.

KW Hydrolase; Aspartyl protease; Digestion; 3D-structure.

FT ACT_SITE 32 32 BY SIMILARITY.
FT ACT_SITE 214 214 BY SIMILARITY.
FT DISULFID 45 50 BY SIMILARITY.
FT DISULFID 206 209 BY SIMILARITY.
FT DISULFID 247 280 BY SIMILARITY.
FT STRAND 2 9
FT TURN 10 12
FT STRAND 13 20
FT TURN 21 24
FT STRAND 25 32
FT TURN 33 34
FT STRAND 38 40
FT STRAND 42 42
FT TURN 43 44
FT HELIX 48 51
FT TURN 52 52
FT STRAND 56 56
FT HELIX 58 60
FT TURN 62 63
FT STRAND 65 74
FT STRAND 79 90
FT STRAND 96 106
FT TURN 110 114
FT STRAND 119 122
FT HELIX 126 128
FT HELIX 130 132
FT HELIX 136 142
FT TURN 143 144
FT STRAND 150 154
FT TURN 158 159
FT STRAND 163 167
FT HELIX 172 174
FT STRAND 175 175
FT STRAND 180 187
FT TURN 188 189
FT STRAND 190 194
FT STRAND 196 199
FT TURN 200 201
FT STRAND 202 203
FT STRAND 209 213
FT TURN 215 216
FT STRAND 220 222
FT TURN 224 226
FT HELIX 227 234
FT TURN 235 235
FT STRAND 237 238
FT STRAND 243 244
FT TURN 247 248
FT STRAND 256 260
FT TURN 261 262
FT STRAND 263 267
FT HELIX 269 272
FT STRAND 273 275
FT STRAND 280 282
FT STRAND 284 286
FT STRAND 296 299
FT HELIX 301 306
FT STRAND 307 312
FT TURN 313 316
FT STRAND 317 324
SQ SEQUENCE 324 AA; 34 MW; EE3A6097B6941DD7 CRC64;

Query Match 12.4%; Score 330; DB 1; Length 324; 

Best Local Similarity 27.9%; Pred. No. 3.7e-19; 

Matches 104; Conservative 67; Mismatches 136; Indels 66; Gaps 15; 

Qy 63 MVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVG AAPHPFLHRYYQRQLS 118 

:::::: II : : : I : I I : : : : I I I I I I I : I I : : : I 

Db 2 VTEQMKNEADTEYYGVISIGTPPESFKVIFDTGSSNLWVSSSHCSAQACSNHNKFKPRQS 61 

Qy 119 STYRDLRKGVYVPYTQGKWEGELGTDLVS I PHG — PNVTVRANIAAITESDKFFINGSNW 176 

I I I : I I : I I I I I I I I : I I I : : I I I ' ■ 

Db 62 STYVETGKTVDLTYGTGGMRGILGQDTVSVGGGSDPNQELG ESQTEPGPFQA-AAPF 117 

Qy 177 EGI LGLAYAEIARPDDSLEP FFDSLVKQTHV- PNLFSLHLCGAGFPLNQS EVLASVGGSM 235 

: I I I I I I I II III:: I : I : I I I : I I I 1 I I I : 
Db 118 DGILGLAYPSIAAA — GAVPVFDNMGSQSLVEKDLFSFYLSGGG — ANGSEVM 166 

Qy 236 IIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMD-CKEYNYDKSIVDSGTTN 294 

: I I : I : I I I I I : : I : I I : : I : : : I I I : I : : | | | : | | : 

Db 167 -LGGVDNSHYTGSIHWIPVTAEKYWQVALDGITVNGQTAACEGC QAIVDTGTSK 219 

Qy 295 LRLPKWFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTN 354 

: I : | | | : : | : I I : I I I 
Db 220 IVAPVSALANIMKDIGASEN QGEMMGN CASVQSLPDITF TI 260 

Qy 355 QSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGT VMGAVIMEGFYWF 406 

: : I I : : II I : I : | | : : | : : 

Db 2 61 NGVKQPLPPSAYIEGDQAFCTS GLGS S GVP SNT S ELWI FGDVFLRNYYT I Y 311 

Qy 4 07 DRARKRIGFAVSA 419 

II : : II I : I 

Db 312 D RTNN KVG FAPAA 324 


RESULT 6 
CATD_BOVIN


ID CATD_BOVIN STANDARD; PRT; 390 AA. 

AC P80209; Q9TS27; 

DT 01-JUL-1993 (Rel. 26, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Cathepsin D precursor (EC 3.4.23.5). 

GN CTSD. 

OS Bos taurus (Bovine).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea;

OC Bovidae; Bovinae; Bos.

OX NCBI_TaxID=9913;

RN [1] 

RP SEQUENCE OF 1-48. 

RC TISSUE=Milk; 

RX MEDLINE=93202276; PubMed=8454061;

RA Larsen L.B., Boisen A., Petersen T.E.;

RT "Procathepsin D cannot autoactivate to cathepsin D at acid pH.";

RL FEBS Lett. 319:54-58(1993). 

RN [2] 

RP SEQUENCE OF 45-390, AND X-RAY CRYSTALLOGRAPHY (3 ANGSTROMS).

RC TISSUE=Liver;

RX MEDLINE=93223670; PubMed=8467789;

RA Metcalf P., Fusek M.;

RT "Two crystal structures for cathepsin D: the lysosomal targeting

RT signal and active site.";

RL EMBO J. 12:1293-1302(1993). 

CC -!- FUNCTION: Acid protease active in intracellular protein breakdown. 

CC -!- CATALYTIC ACTIVITY: Specificity similar to, but narrower than, 
CC that of pepsin A. Does not cleave the 4-Gln-|-His-5 bond in B

CC chain of insulin.

CC -!- SUBUNIT: CONSISTS OF A LIGHT CHAIN AND A HEAVY CHAIN.

CC -!- SUBCELLULAR LOCATION: Lysosomal.

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

DR HSSP; P07339; 1LYB.

DR MEROPS; A01.009; -.

DR InterPro; IPR001969; Aspprotease_site.

DR InterPro; IPR001461; AspproteaseA1.

DR Pfam; PF00026; asp; 1.

DR PRINTS; PR00792; PEPSIN.

DR PROSITE; PS00141; ASP_PROTEASE; 2.

KW Hydrolase; Aspartyl protease; Glycoprotein; Lysosome; Zymogen.

FT PROPEP 1 44 ACTIVATION PEPTIDE.
FT CHAIN 45 390 CATHEPSIN D.
FT ACT_SITE 77 77
FT ACT_SITE 273 273
FT DISULFID 71 140
FT DISULFID 90 97
FT DISULFID 264 268
FT DISULFID 307 344
FT CARBOHYD 114 114 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 241 241 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 390 AA; 42488 MW; 5B38AA1C33C48D35 CRC64;


Query Match 11.8%; Score 314.5; DB 1; Length 390;

Best Local Similarity 28.0%; Pred. No. 8.4e-18;

Matches 113; Conservative 72; Mismatches 128; Indels 91; Gaps 21;


Qy 53 EPG-RRGSFVEMVDNLRGKSGQGYWEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFL — 109 

II 1:1 I : : I I I I : : I : I I I : : I I I I : I I : I 

Db 39 E P AVRQ GPIPELLKN YM D AQ YYGEIGIGTPPQCFTWFDTGSANLWVPSIHCKLLDI 95 

Qy 110 HRYYQRQLS STYRDLRKGVY — VP YTQGKWEGELGTDLVS I PHGPN VTVR 157 

III I I I I : : I : I I II I I I : I I : Ml: 

Db 96 ACWTHRKYNSDKSSTY — VKNGTTFDIHYGSGSLSGYLSQDTVSVPCNPSSSSPGGVTVQ 153 

Qy 158 ANI--AAITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHV-PNLFSLH 214 

II: II : : : I I I I : I I I : : : : I I I : I : : I I I : I I 

Db 154 RQTFGEAI KQPGWFI-AAKFDGILGMAYPRIS — VNNVLPVFDNLMQQKLVDKNVFS — 2 08 

Qy 215 LCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDL 274 

III:: I I I : : : II I III : : I : I :::::::: I I 

Db 209 FFLNR- DPKAQPGGELMLGGTD S KYYRGS LMFHNVT RQAYWQI HMDQLDV- GS S L 2 61 

Qy 275 KMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLV-CWQA 333 

: I I : : | | | : | I : : I : I : I I : I I : : I : 

Db 262 TV-CK— GGCEAIVDTGTSLIVGPVEEVRELQKAIGAVPLIQ GEYMIPCEKV 310 

Qy 334 GTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGT- 392 

: | : : : | | : I I I : i I : I I : I 

Db 311 SS LPEVTVKLGG KDYALSPED- YALKVSQAETTVC 344 

Qy 393 VMGAVIMEGFYWFDRARKRI GFAVSA 419 

: : I I : : I I I I I : I : I I : I 
Db 345 LSGFMGMDIPPPGGPLWILGDVFIGRYYTVFDRDQNRVGLAEAA 388 


RESULT 7 
PEP1_RABIT


ID PEP1_RABIT STANDARD; PRT; 387 AA. 

AC P28712; 

DT 01-DEC-1992 (Rel. 24, Created) 

DT 01-DEC-1992 (Rel. 24, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Pepsin II-1 precursor (EC 3.4.23.1) (Pepsin A).

OS Oryctolagus cuniculus (Rabbit) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus. 

OX NCBI_TaxID=9986; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=91009127; PubMed=2129536;

RA Kageyama T., Tanabe K., Koiwai O. ; 

RT "Structure and development of rabbit pepsinogens. Stage-specific 

RT zymogens, nucleotide sequences of cDNAs, molecular evolution, and 

RT gene expression during development."; 

RL J. Biol. Chem. 265:17031-17038(1990). 

CC -!- FUNCTION: SHOWS PARTICULARLY BROAD SPECIFICITY; ALTHOUGH BONDS 

CC INVOLVING PHENYLALANINE AND LEUCINE ARE PREFERRED, MANY OTHERS ARE 

CC ALSO CLEAVED TO SOME EXTENT. 

CC -!- CATALYTIC ACTIVITY: Preferential cleavage: hydrophobic, preferably

CC aromatic, residues in P1 and P1' positions. Cleaves 1-Phe-|-Val-2,

CC 4-Gln-|-His-5, 13-Glu-|-Ala-14, 14-Ala-|-Leu-15, 15-Leu-|-Tyr-16,

CC 16-Tyr-|-Leu-17, 23-Gly-|-Phe-24, 24-Phe-|-Phe-25 and 25-Phe-|-

CC Tyr-26 bonds in the B chain of insulin.

CC -!- DEVELOPMENTAL STAGE : PEPSINOGENS IN GROUP I, II, AND III WHERE 
CC THE PREDOMINANT ZYMOGENS AT LATE POSTNATAL STAGE. 

CC -!- MISCELLANEOUS: THE EXPRESSION OF PEPSINOGEN GENES IS REGULATED BY 

CC HORMONES AND RELATED SUBSTANCES. 

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

DR PIR; B38302; B38302.

DR HSSP; P00791; 1PSA.

DR MEROPS; A01.001;

DR InterPro; IPR001969; Aspprotease_site.

DR InterPro; IPR001461; AspproteaseA1.

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

KW Hydrolase; Aspartyl protease; Digestion; Zymogen; Signal; 
KW Phosphorylation; Multigene family. 
FT SIGNAL 1 15
FT PROPEP 16 59 ACTIVATION PEPTIDE.
FT CHAIN 60 387 PEPSIN II-1.
FT MOD_RES 129 129 PHOSPHORYLATION (POTENTIAL).
FT ACT_SITE 93 93 BY SIMILARITY.
FT ACT_SITE 276 276 BY SIMILARITY.
FT DISULFID 106 111 BY SIMILARITY.
FT DISULFID 267 271 BY SIMILARITY.
FT DISULFID 310 343 BY SIMILARITY.

SQ SEQUENCE 387 AA; 42070 MW; A6EC48F715541A48 CRC64;


Query Match 11.6%; Score 309; DB 1; Length 387;

Best Local Similarity 27.1%; Pred. No. 2.3e-17;

Matches 98; Conservative 68; Mismatches 130; Indels 66; Gaps 15;

Qy 75 Y YVEMT VG S P PQT LN I LVDT G S S N FAVG AAPHP FLHRYYQRQLS ST YRDLRKGVYV 130

I : : : : I : I II :: I I I I I I I : : III:: III:: : : :

Db 75 YFGTISIGTPPQEFTVI FDTGSSNLWVPSTYCSSLACFLHKRFNPDDSSTFQATSETLSI 134

Qy 131 PYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESD KFFINGSNWEGILGLAYAEI 187

I I Mill: I : I ::::: I : : :: I I M I I I I

Db 135 TYGTGSMTGILGYDTVKV GNIEDTNQIFGLSKTEPGITFLV -- APFDGILGLAYPSI 189

Qy 188 ARPDDSLEPFFDSLVKQTHV-PNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYT 246

: I : III:: : I : I I I : : I I I : : I I I I I I I

Db 190 SAS DAT -- PVFDNMWNEGLVS EDLFS VYLS SNG EKGSMVMFGGI DS S YYT 237

Qy 247 GS LWYT P I RREW Y YEVI I VRVE I NGQ DLKM -- DCKEYNYDKS IVDSGTTNLRLPKKVFEA 304

I I I : I : ||:::: : I I I : : I : : : I ! : I I : I I

Db 238 GSLNWVPVSHEGYWQITMDSITINGETIACADSC QAWDT GT S L LAG PT S AI S K 291

Qy 305 AVKSIKAASSTEKFPDGFWLGEQLV-CWQAGTTPWNIFPVISLYLMGEVTNQSFRITILP 363

II:: I I I : : I : I : I II

Db 292 IQSYIGASKNL LGENIISCSAIDSLPDIVF TINN 325

Qy 364 QQYLRPVED-VATSQDDC YKFAISQSSTGT -- VMGAVIMEGFYWFDRARKRI GFAV 417

II I : III : : I I : : I I : : : I I I M : : I I

Db 326 VQYPLPASAYILKEDDDCLSGFDGMNLDTSYGELWILGDVFIRQYFTVFDRANNQVGLAA 385

Qy 418 SA 419

: I

Db 386 AA 387

RESULT 8 
PEP4_MACFU 

ID PEP4_MACFU STANDARD; PRT; 388 AA. 

AC P27678; 

DT 01-AUG-1992 (Rel. 23, Created) 

DT 01-AUG-1992 (Rel. 23, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Pepsin A-4 precursor (EC 3.4.23.1) (Pepsin I/II). 

GN PGA. 

OS Macaca fuscata fuscata (Japanese macaque) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Primates; Catarrhini; Cercopithecidae ; 

OC Cercopithecinae; Macaca. 

OX NCBI_TaxID=9543; 

RN [1] 

RP SEQUENCE FROM N.A., AND SEQUENCE OF 16-70. 

RC TISSUE=Gastric mucosa; 

RX MEDLINE=92037645; PubMed=1935977;

RA Kageyama T., Tanabe K., Koiwai O. ; 

RT "Development-dependent expression of isozymogens of monkey 

RT pepsinogens and structural differences between them."; 

RL Eur. J. Biochem. 202:205-215(1991). 

CC -!- FUNCTION: SHOWS PARTICULARLY BROAD SPECIFICITY; ALTHOUGH BONDS 

CC INVOLVING PHENYLALANINE AND LEUCINE ARE PREFERRED, MANY OTHERS ARE 

CC ALSO CLEAVED TO SOME EXTENT. 

CC -!- CATALYTIC ACTIVITY: Preferential cleavage: hydrophobic, preferably 

CC aromatic, residues in P1 and P1' positions. Cleaves 1-Phe-|-Val-2,

CC 4-Gln-|-His-5, 13-Glu-|-Ala-14, 14-Ala-|-Leu-15, 15-Leu-|-Tyr-16,

CC 16-Tyr-|-Leu-17, 23-Gly-|-Phe-24, 24-Phe-|-Phe-25 and 25-Phe-|-

CC Tyr-26 bonds in the B chain of insulin. 

CC -!- MISCELLANEOUS: THE EXPRESSION OF PEPSINOGEN GENES IS REGULATED BY 

CC HORMONES AND RELATED SUBSTANCES. 

CC -!- MISCELLANEOUS: EACH PEPSINOGEN IS CONVERTED TO CORRESPONDING 

CC PEPSIN AT PH 2.0 IN PART AS A RESULT OF THE RELEASE OF A 47 AA 

CC ACTIVATION SEGMENT AND IN PART AS A RESULT OF STEPWISE PROTEOLYTIC 

CC CLEAVAGE VIA AN INTERMEDIATE FORM(S).

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY Al. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X59753; CAA42425.1; 

DR PIR; S19682; S19682. 

DR HSSP; P00790; 1PSN. 

DR MEROPS; A01.001; 

DR InterPro; IPR001969; Aspprotease_site.

DR InterPro; IPR001461; AspproteaseA1.

DR Pfam; PF00026; asp; 1.

DR PRINTS; PR00792; PEPSIN.

DR PROSITE; PS00141; ASP_PROTEASE; 2.

KW Hydrolase; Aspartyl protease; Digestion; Zymogen;

KW Signal; Glycoprotein.


FT SIGNAL 1 15 BY SIMILARITY.
FT PROPEP 16 38 ACTIVATION PEPTIDE.
FT PROPEP 39 62 ACTIVATION PEPTIDE.
FT CHAIN 63 388 PEPSIN A-4.
FT ACT_SITE 94 94 BY SIMILARITY.
FT ACT_SITE 277 277 BY SIMILARITY.
FT DISULFID 107 112 BY SIMILARITY.
FT DISULFID 268 272 BY SIMILARITY.
FT DISULFID 311 344 BY SIMILARITY.
FT CARBOHYD 88 88 N-LINKED (GLCNAC. .

SQ SEQUENCE 388 AA; 41955 MW; A2923AB1F7FCDEB9 CRC64;


Query Match 11.5%; Score 307.5; DB 1; Length 388;

Best Local Similarity 27.6%; Pred. No. 3.1e-17;

Matches 108; Conservative 65; Mismatches 135; Indels 83; Gaps 17;


Qy 44 PRETDEEPEEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGA 103

I I I : I I : : : : : I : : : I : I I :: I I I I I I I

Db 60 PTLIDEQPLE NYLDV EYFGTIGIGTPAQNFTWFDTGSSNLWV -- 102

Qy 104 APHPFL HRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTV 156

I : I : I I I I I I I I : I I Mill: : :

Db 103 -PSVYCYSLACMDHNLFNPQDSSTYRATSKTVSITYGTGSMTGILGYDTVKV GGISD 158

Qy 157 RANIAAITESDK-FFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHV-PNLFSLH 214

I : : I : : I I : : : : I I I I I I I I : III:: I I : I I I : :

Db 159 TNQIFGLSETEPGFFLYFAPFDGILGLAYPSIS--SSGATPVFDNIWNQRLVSQDLFSVY 216

Qy 215 LCGAGFPLNQSEVLASVGGSMI I GGI DHSLYTGS LWYTPI RREWYYEVI I VRVEINGQDL 274

I : I I I : I I I I I I I I I I I : I : I I : : : : : : I I : :

Db 217 LSAD DQS GSWI FGGI DS S YYTGS LNWVPVS VEGYWQI S VDS ITMNGKTI 266

Qy 275 -- KMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLV-CW 331

I : : | | | : | | : | | | | : : : : I I : I I

Db 267 ACAKGC QAIVDTGTSLLTGPTSPIANIQSDIGASENSD GEMWSCS 312

Qy 332 QAGTT PWNI FPVI S L YLMGEVTNQS FRI T I LPQQ Y~ LRPVEDVAT SQDDC YK FAI 385

: I : I 111111:111

Db 313 AISSLPDIVF TINGVQYPLPPSAYILQSQGSCTSGFQGMDVP 354

Qy 386 SQSSTGTVMGAVIMEGFYWFDRARKRIGFA 416

: : I : : I I : : : I I I I I : : I I

Db 355 TESGELWILGDVFIRQYFTVFDRANNQVGLA 385


RESULT 9 
PEPA_CHICK 

ID PEPA_CHICK STANDARD; PRT; 367 AA. 

AC P00793; 

DT 21-JUL-1986 (Rel. 01, Created) 

DT 21-JUL-1986 (Rel. 01, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Pepsin A precursor (EC 3.4.23.1) . 

OS Gallus gallus (Chicken) . 


OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae; 

OC Gallus . 

OX NCBI_TaxID=9031; 

RN [1] 

RP SEQUENCE. 

RX MEDLINE=84004412; PubMed=6617663;

RA Baudys M., Kostka V.;

RT "Covalent structure of chicken pepsinogen."; 

RL Eur. J. Biochem. 136:89-99(1983). 

CC -!- FUNCTION: SHOWS PARTICULARLY BROAD SPECIFICITY; ALTHOUGH BONDS 

CC INVOLVING PHENYLALANINE AND LEUCINE ARE PREFERRED, MANY OTHERS ARE 

CC ALSO CLEAVED TO SOME EXTENT. 

CC -!- CATALYTIC ACTIVITY: Preferential cleavage: hydrophobic, preferably 
CC aromatic, residues in P1 and P1' positions. Cleaves 1-Phe-|-Val-2,

CC 4-Gln-|-His-5, 13-Glu-|-Ala-14, 14-Ala-|-Leu-15, 15-Leu-|-Tyr-16,

CC 16-Tyr-|-Leu-17, 23-Gly-|-Phe-24, 24-Phe-|-Phe-25 and 25-Phe-|-

CC Tyr-26 bonds in the B chain of insulin. 

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

DR HSSP; P00794; 4CMS.

DR MEROPS; A01.UPW;

DR InterPro; IPR001969; Aspprotease_site.

DR InterPro; IPR001461; AspproteaseA1.

DR Pfam; PF00026; asp; 1.

DR PRINTS; PR00792; PEPSIN.

DR PROSITE; PS00141; ASP_PROTEASE; 2.

KW Hydrolase; Aspartyl protease; Digestion; Zymogen; Glycoprotein. 


FT PROPEP 1 42 ACTIVATION PEPTIDE.
FT CHAIN 43 367 PEPSIN A.
FT ACT_SITE 77 77
FT ACT_SITE 260 260
FT CARBOHYD 113 113 N-LINKED (GLCNAC...).
FT DISULFID 90 95
FT DISULFID 251 255
FT DISULFID 290 323

SQ SEQUENCE 367 AA; 40431 MW; 0C547E7FD8F5B341 CRC64;


Query Match 11.4%; Score 305; DB 1; Length 367;

Best Local Similarity 24.0%; Pred. No. 4.5e-17;

Matches 88; Conservative 70; Mismatches 124; Indels 84; Gaps 13;


Qy 75 Y YVEMT VG S P P QT LN I LVDT G S S N FAVGAAP H P FL HRYYQRQLS STYRDLRKG 127

II : : : I : I I : : : I I I I I I I I : I : : I I I I :

Db 59 YYGTI S I GTPQQDFSVI FDTGS SNLWV PSI YCKSSACSNHKRFDPSKSSTYVSTNET 115


Qy 128 VYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDK-FFINGSNWEGILGLAYAE 186 

II:! I Mill:: : : | : | : : | : : | | : : | | | | | | : 

Db 116 VYIAYGTGSMSGILGYDTVAV SSIDVQNQIFGLSETEPGSFFYYCNFDGILGLAFPS 172

Qy 187 IARPDDSLEPFFDSLVKQTHV-PNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLY 245 

I : | | | : : : | | : | I I : : I I I : : I I I I : 

Db 173 IS — S S GAT PVFDNMMSQHLVAQDLFS VYLS KDG ETGS FVLFGGI DPNYT 220 


Qy 246 TGSLWYTPIRREWYYEVIIVRVEINGQDLK--MDCKEYNYDKSIVDSGTTNLRLPKKVFE 303 

I : : : I : I I : : : : I | : : : I : : I I I : I I : I : I : : 

Db 221 TKGIYWVPLSAETYWQITMDRVTVGNKYVACFFTC QAIVDTGTSLLVMPQGAYN 274 


Qy 304 AAVKSIKAASSTE KFPDGFWLGEQLVCWQAGTTPWNI FPVI SLYLMGEVTNQS 356 

: I : : I I III : : : : I 
Db 275 RIIKDLGVSSDGEISCDDISKLPD VTFHINGHA 307 

Qy 357 FRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGT VMGAVIMEGFYWFDRAR 410 

: I :: I II : : I I : : I I : I I I I 

Db 308 FTLPASAYVLNEDGSCMLGFENMGTPTELGEQWILGDVFIREYYVI FDRAN 358 

Qy 411 KRIGFA 416

: : I : 

Db 359 NKVGLS 364 


RESULT 10 
PEPE_CHICK 

ID PEPE_CHICK STANDARD; PRT; 383 AA. 

AC P16476; 

DT 01-AUG-1990 (Rel. 15, Created) 

DT 01-AUG-1990 (Rel. 15, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Embryonic pepsinogen precursor (EC 3.4.23.-). 

OS Gallus gallus (Chicken) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;

OC Gallus . 

OX NCBI_TaxID=9031; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=88227903; PubMed=3131317;

RA Hayashi K., Agata K., Mochii M., Yasugi S., Eguchi G., Mizuno T.;

RT "Molecular cloning and the nucleotide sequence of cDNA for embryonic

RT chicken pepsinogen: phylogenetic relationship with prochymosin.";

RL J. Biochem. 103:290-296(1988). 

CC -!- DEVELOPMENTAL STAGE: SPECIFICALLY SECRETED DURING THE EMBRYONIC 
CC PERIOD IN THE CHICKEN PROVENTRICULUS (GLANDULAR STOMACH).

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; D00215; BAA00153.1; -. 

DR PIR; A41443; A41443. 

DR HSSP; P00794; 4CMS.

DR MEROPS; A01.028; -.

DR InterPro; IPR001969; Aspprotease_site.

DR InterPro; IPR001461; AspproteaseA1.

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

KW Hydrolase; Aspartyl protease; Digestion; Signal; Glycoprotein. 

FT SIGNAL 1 16 POTENTIAL. 

FT CHAIN 17 383 EMBRYONIC PEPSINOGEN. 


FT ACT_SITE 94 94 BY SIMILARITY.
FT ACT_SITE 276 276 BY SIMILARITY.
FT DISULFID 107 112 BY SIMILARITY.
FT DISULFID 267 271 BY SIMILARITY.
FT DISULFID 310 344 BY SIMILARITY.
FT CARBOHYD 132 132 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 204 204 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 309 309 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 350 350 N-LINKED (GLCNAC...) (POTENTIAL)
FT VARIANT 51 51 T -> S.

SQ SEQUENCE 383 AA; 41719 MW; 1642796871611F54 CRC64;

Query Match 11.3%; Score 301.5; DB 1; Length 383;

Best Local Similarity 25.2%; Pred. No. 9.1e-17;

Matches 90; Conservative 76; Mismatches 124; Indels 67; Gaps 14;


Qy 75 Y YVEMT VG S P P QT LN I LVDT G S S N FAVGA APHPFLHRYYQRQLSSTYRDLRKGVYV 130

II : : : I : I I I : : I I I I I I I : : I I : : MM: : : :

Db 76 YYGTISIGTPPQDFTWFDTGSSNLWVPSVSCTSPACQSHQMFNPSQSSTYKSTGQNLSI 135

Qy 131 PYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARP 190

I I || : | | | : : : : : : I MM : M I I I I I : I

Db 136 HYGTGDMEGTVGCDTVTVASLMDTNQLFGLST-SEPGQFFVY-VKFDGILGLGYPSLAA- 192

Qy 191 DDSLEPFFDSLVKQTHV-PNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSL 249

I : I I I : : I : : : M I M M M : : I M I I M I I :

Db 193 -DGITPVFDNMVNESLLEQNLFSVYLS REPMGSMWFGGI DES YFTGS I 240

Qy 250 WYTPIRREWYYEVIIVRVEINGQDL -- KMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVK 307

: I : : I : : : : : M M : I : M M M I : : I

Db 241 NWIPVSYQGYWQISMDSIIVNKQEIACSSGC QAI IDTGTSLVAGPASDINDIQS 294

Qy 308 S I KAAS STEKFPDGFWLGEQLVCWQAGTT PWNI FPVI SL YLMGEVTNQSFRITILP 363

: : I M III I : : : : : : M :

Db 295 AVGANQNT YGEYSV NCSHILAMPDWFVIGGI 326

Qy 364 QQYLRPVEDVA TSQDDCYKFAISQSSTGTVMGAVIMEGFYWFDRARKRIGFA 416

II II : I I I : M : : I I : M M II I Mil

Db 327 -QY -- PVPALAYTEQNGQGTCMSS FQNS SADLWI LGDVFI RVYYS I FDRANNRVGLA 380


RESULT 11 
CATE_HUMAN

ID CATE_HUMAN STANDARD; PRT; 396 AA.

AC P14091; 

DT 01-JAN-1990 (Rel. 13, Created) 

DT 01-JAN-1990 (Rel. 13, Last sequence update) 

DT 15-SEP-2003 (Rel. 42, Last annotation update) 

DE Cathepsin E precursor (EC 3.4.23.34). 

GN CTSE. 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
OX NCBI_TaxID=9606; 
RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=89380302; PubMed=2674141;

RA Azuma T., Pals G., Mohandas T.K., Couvreur J.M., Taggart R.T.;

RT "Human gastric cathepsin E. Predicted sequence, localization to

RT chromosome 1, and sequence homology with other aspartic 

RT proteinases."; 

RL J. Biol. Chem. 264:16748-16753(1989). 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=92112877; PubMed=1370478;

RA Azuma T., Liu W.G., Vander Laan D.J., Bowcock A.M., Taggart R.T.;

RT "Human gastric cathepsin E gene. Multiple transcripts result from

RT alternative polyadenylation of the primary transcripts of a single

RT gene locus at 1q31-q32.";

RL J. Biol. Chem. 267:1609-1614(1992). 

RN [3] 

RP SEQUENCE FROM N.A. 

RA Tatnell P.J., Kay J.; 

RT "HUman procathepsin E."; 

RL Submitted (OCT-1999) to the EMBL/GenBank/DDBJ databases. 

RN [4] 

RP SEQUENCE OF 54-68; 77-95; 141-155; 275-285 AND 389-396. 

RX MEDLINE=90241267; PubMed=2334440;

RA Athauda S.B.P., Matsuzaki O., Kageyama T., Takahashi K.;

RT "Structural evidence for two isozymic forms and the carbohydrate

RT attachment site of human gastric cathepsin E.";

RL Biochem. Biophys. Res. Commun. 168:878-885(1990).

CC -!- FUNCTION: DUE TO ITS INTRACELLULAR LOCATION AND DISTRIBUTION IN
CC LYMPHOID ASSOCIATED TISSUE, IT MAY HAVE A ROLE IN IMMUNE FUNCTION.

CC -!- CATALYTIC ACTIVITY: Similar to cathepsin D, but slightly broader
CC specificity.

CC -!- SUBUNIT: Homodimer; disulfide-linked.

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; M84424; AAA52300.1; 

DR EMBL; M84413; AAA52300.1; JOINED. 

DR EMBL; M84417; AAA52300.1; JOINED. 

DR EMBL; M84418; AAA52300.1; JOINED. 

DR EMBL; M84419; AAA52300.1; JOINED. 

DR EMBL; M84420; AAA52300.1; JOINED. 

DR EMBL; M84421; AAA52300.1; JOINED. 

DR EMBL; M84422; AAA52300.1; JOINED. 

DR EMBL; J05036; AAA52130.1; -. 

DR EMBL; AJ250717; CAB82850.1; -. 

DR PIR; A42038; A34401. 

DR PDB; 1LCG; 17-APR-02. 

DR MEROPS; A01.010; -. 

DR Genew; HGNC:2530; CTSE. 

DR MIM; 116890; -. 

DR GO; GO:0007586; P:digestion; TAS.

DR InterPro; IPR001969; Aspprotease_site.

DR InterPro; IPR001461; AspproteaseA1.

DR Pfam; PF00026; asp; 1.

DR PRINTS; PR00792; PEPSIN.

DR PROSITE; PS00141; ASP_PROTEASE; 2.

KW Hydrolase; Aspartyl protease; Glycoprotein; Zymogen; Signal;

KW Polymorphism; Pyrrolidone carboxylic acid; 3D-structure.

FT SIGNAL 1 17
FT PROPEP 18 53 ACTIVATION PEPTIDE.
FT CHAIN 54 396 CATHEPSIN E.
FT MOD_RES 18 18 PYRROLIDONE CARBOXYLIC ACID
FT ACT_SITE 96 96 BY SIMILARITY.
FT ACT_SITE 281 281 BY SIMILARITY.
FT DISULFID 60 60 INTERCHAIN (PROBABLE).
FT DISULFID 109 114 BY SIMILARITY.
FT DISULFID 272 276 BY SIMILARITY.
FT DISULFID 314 351 BY SIMILARITY.
FT CARBOHYD 90 90 N-LINKED (GLCNAC...).
FT CARBOHYD 220 220 O-LINKED (POTENTIAL).
FT CARBOHYD 333 333 O-LINKED (POTENTIAL).
FT VARIANT 324 324 T -> I (IN dbSNP:6503).
FT /FTId=VAR_014572.

SQ SEQUENCE 396 AA; 42793 MW; 40B643C5FB01521E CRC64;


Query Match 11.3%; Score 301.5; DB 1; Length 396; 

Best Local Similarity 25.8%; Pred. No. 9.6e-17; 

Matches 100; Conservative 68; Mismatches 144; Indels 75; Gaps 16; 

Qy 48 DEEPEEPGRRGS FVEMVDNLRGKSGQGYYVEMTVGS PPQTLNI LVDTGS SNFAVGA 103 

I : : I I : : I I : : : : I I I I I :: I I I I I I I : 

Db 63 DQSAKEP LINYLD MEYFGTI S I GS PPQNFTVI FDTGS SNLWVPSVYCT 110 

Qy 104 APHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAI 163 

: I I : I I I I I : : | | I :| I ||: I I : : : 

Db 111 SPACKTHSRFQPSQSSTYSQPGQSFSIQYGTGSLSGIIGADQVSV-EGLTWGQQFGESV 169 

Qy 164 TESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHVP-NLFSLHLCGAGFPL 222 

II : | : : : : : I I I I I I : I : I I I : : : I I : I I : : : 

Db 170 TEPGQTFVD-AEFDGILGLGYPSLA — VGGVTPVFDNMMAQNLVDLPMFSVYM 219 

Qy 223 NQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYN 282 

I I : | M I I I : : I I I : I : : : I : : : : : : : I III 

Db 220 -SSNPEGGAGSELIFGGYDHSHFSGSLNWVPVTKQAYWQIALDNIQVGG — TVMFCSE — 274 

Qy 283 YDKS I VDS GTTNLRLPKKVFEAAVKS I KAAS STEKFPDGFWLGEQLVCWQAGTTPWNI FP 342 

: : I I I : I I : : I : : I I I I I : I I : I 

Db 275 GCQAIVDTGTSLITGPSDKIKQLQNAIGAAP VDGEYAVE CANLNVMP 321 

Qy 343 VISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTG 391 

: : : I I : I I : I I hi 

Db 322 DVT FT I NG VPYTLSPTAY— TLLDFVDGMQFC SSGFQGLDIHPPAG 365 

Qy 392 — TVMGAVIMEGFYWFDRARKRIGFA 416 

: : I I : I I I I I I I : I I 
Db 366 PLWI LGDVFI RQFYSVFDRGNNRVGLA 392 


RESULT 12 


CATD_HUMAN 

ID CATD_HUMAN STANDARD; PRT; 412 AA.

AC P07339; 

DT 01-APR-1988 (Rel. 07, Created) 

DT 01-APR-1988 (Rel. 07, Last sequence update) 

DT 15-SEP-2003 (Rel. 42, Last annotation update) 

DE Cathepsin D precursor (EC 3.4.23.5). 

GN CTSD. 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=85270436; PubMed=3927292;

RA Faust P.L., Kornfeld S., Chirgwin J.M.;

RT "Cloning and sequence analysis of cDNA for human cathepsin D."; 

RL Proc. Natl. Acad. Sci. U.S.A. 82:4910-4914(1985). 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=87231068; PubMed=3588310;

RA Westley B.R., May F.E.B.; 

RT "Oestrogen regulates cathepsin D mRNA levels in oestrogen responsive 

RT human breast cancer cells."; 

RL Nucleic Acids Res. 15:3773-3786(1987).

RN [3] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=91299158; PubMed=2069717;

RA Redecker B., Heckendorf B., Grosch H.W., Mersmann G., Hasilik A.;

RT "Molecular organization of the human cathepsin D gene.";

RL DNA Cell Biol. 10:423-431(1991).

RN [4] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Kidney;

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G., 

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length 

RT human and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [5] 

RP SEQUENCE OF 1-22 FROM N.A. 


RX MEDLINE=94085791; PubMed=8262386; 

RA May F.E., Smith D.J., Westley B.R.; 

RT "The human cathepsin D-encoding gene is transcribed from an estrogen- 

RT regulated and a constitutive start point."; 

RL Gene 134:277-282(1993). 

RN [6] 

RP SEQUENCE OF 1-22 FROM N.A. 

RX MEDLINE=95021301; PubMed=7935485;

RA Augereau P., Miralles F., Cavailles V., Gaudelet C., Parker M.,

RA Rochefort H.;

RT "Characterization of the proximal estrogen-responsive element of 

RT human cathepsin D gene."; 

RL Mol. Endocrinol. 8:693-703(1994). 

RN [7] 

RP SEQUENCE OF 170-180. 

RC TISSUE=Liver; 

RA Hochstrasser D.F., Frutiger S., Paquet N., Bairoch A., Ravier F., 

RA Pasquali C., Sanchez J.-C., Tissot J.-D., Bjellqvist B., Vargas R.,

RA Appel R.D., Hughes G.J.; 

RL Submitted (JUN-1992) to the SWISS-PROT data bank. 

RN [8] 

RP VARIANT VAL-58. 

RX MEDLINE=20179010; PubMed=10716266;

RA Papassotiropoulos A., Bagli M., Kurz A., Kornhuber J., Forstl H.,

RA Maier W., Pauls J., Lautenschlager N., Heun R.;

RT "A genetic variation of cathepsin D is a major risk factor for 

RT Alzheimer's disease."; 

RL Ann. Neurol. 47:399-403(2000). 

RN [9] 

RP X-RAY CRYSTALLOGRAPHY (3 ANGSTROMS).

RC TISSUE=Spleen; 

RX MEDLINE=93223670; PubMed=8467789;

RA Metcalf P., Fusek M.;

RT "Two crystal structures for cathepsin D: the lysosomal targeting 

RT signal and active site."; 

RL EMBO J. 12:1293-1302(1993). 

RN [10] 

RP X-RAY CRYSTALLOGRAPHY (2.5 ANGSTROMS). 

RC TISSUE=Liver; 

RX MEDLINE=93342076; PubMed=8393577;

RA Baldwin E.T., Bhat T.N., Gulnik S., Hosur M.V., Sowder R.C. II,

RA Cachau R.E., Collins J., Silva A.M., Erickson J.W.;

RT "Crystal structures of native and inhibited forms of human cathepsin 

RT D: implications for lysosomal targeting and drug design."; 

RL Proc. Natl. Acad. Sci. U.S.A. 90:6796-6800(1993). 

CC -!- FUNCTION: Acid protease active in intracellular protein breakdown. 
CC Involved in the pathogenesis of several diseases such as breast 

CC cancer and possibly Alzheimer's disease. 

CC -!- CATALYTIC ACTIVITY: Specificity similar to, but narrower than, 
CC that of pepsin A. Does not cleave the 4-Gln-|-His-5 bond in B

CC chain of insulin. 

CC -!- SUBUNIT: CONSISTS OF A LIGHT CHAIN AND A HEAVY CHAIN. 

CC -!- SUBCELLULAR LOCATION: Lysosomal. 

CC -!- POLYMORPHISM: The Val-58 allele is significantly overrepresented 
CC in demented patients (11.8%) compared with nondemented controls 

CC (4.9%). Carriers of the Val-58 allele have a 3.1-fold increased 

CC risk for developing AD than noncarriers.

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; M11233; AAB59529.1; -. 

DR EMBL; X05344; CAA28955.1; 

DR EMBL; M63138; AAA51922.1; -. 

DR EMBL; M63134; AAA51922.1; JOINED.

DR EMBL; M63135; AAA51922.1; JOINED.

DR EMBL; M63136; AAA51922.1; JOINED.

DR EMBL; M63137; AAA51922.1; JOINED.

DR EMBL; BC016320; AAH16320.1;

DR EMBL; L12980; AAA16314.1; 

DR EMBL; S74689; AAD14156.1; -. 

DR EMBL; S52557; AAD13868.1; -. 

DR PIR; A25771; KHHUD. 

DR PDB; 1LYA; 31-JAN-94. 

DR PDB; 1LYB; 31-JAN-94. 

DR PDB; 1LYW; 22-JUL-99. 

DR MEROPS; A01.009; 

DR SWISS-2DPAGE; P07339; HUMAN. 

DR Siena-2DPAGE; P07339; 

DR Genew; HGNC:2529; CTSD. 

DR MIM; 116840; 

DR GO; GO:0004192; F:cathepsin D activity; TAS.

DR InterPro; IPR001969; Aspprotease_site.

DR InterPro; IPR001461; AspproteaseA1.

DR Pfam; PF00026; asp; 1.

DR PRINTS; PR00792; PEPSIN.

DR PROSITE; PS00141; ASP_PROTEASE; 2.

KW Hydrolase; Aspartyl protease; Glycoprotein; Lysosome; Signal; Zymogen;

KW Polymorphism; Alzheimer's disease; 3D-structure.


FT SIGNAL 1 18
FT PROPEP 19 64 ACTIVATION PEPTIDE.
FT CHAIN 65 412 CATHEPSIN D.
FT CHAIN 65 161 CATHEPSIN D LIGHT CHAIN (PROBABLE).
FT CHAIN 169 412 CATHEPSIN D HEAVY CHAIN (PROBABLE).
FT ACT_SITE 97 97
FT ACT_SITE 295 295
FT DISULFID 91 160
FT DISULFID 110 117
FT DISULFID 286 290
FT DISULFID 329 366
FT CARBOHYD 134 134 N-LINKED (GLCNAC...).
FT CARBOHYD 263 263 N-LINKED (GLCNAC...).
FT VARIANT 58 58 A -> V (ASSOCIATED WITH INCREASED RISK
FT AD; POSSIBLY INFLUENCES SECRETION AND
FT INTRACELLULAR MATURATION; dbSNP:17571)
FT /FTId=VAR_011621.
FT STRAND 67 74
FT TURN 75 77
FT STRAND 78 85
FT TURN 86 89
FT STRAND 90 97
FT TURN 98 99
FT STRAND 103 107
FT TURN 108 109
FT TURN 112 113
FT HELIX 115 118
FT TURN 119 119
FT STRAND 123 123
FT HELIX 125 127
FT TURN 129 130
FT STRAND 132 141
FT STRAND 146 158
FT STRAND 172 184
FT HELIX 188 192
FT STRAND 197 200
FT HELIX 204 206
FT HELIX 208 210
FT HELIX 214 220
FT TURN 221 222
FT STRAND 228 233
FT STRAND 243 247
FT TURN 248 248
FT HELIX 252 254
FT STRAND 255 263
FT STRAND 267 267
FT TURN 268 269
FT STRAND 270 279
FT TURN 280 281


Query Match 11.3%; Score 300.5; DB 1; Length 412; 

Best Local Similarity 26.9%; Pred. No. 1.2e-16; 

Matches 123; Conservative 68; Mismatches 170; Indels 97; Gaps 21; 

Qy 5 LPWLLLWMGAGVLPAHGTQHGIRLPLR SGLGGAPLGL RLP 44 

I I I : I II : I : I I I : I I : I : I 

Db 7 LPLALCLLAA PASAL VR IPLHKFTSI RRTMS E VGG S VE D L I AKG P VS K Y S QAVP 60 

Qy 45 RETDEEPEEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAA 104 

I : I I : : I I I I : : I : M I : : I I I I I I I : 

Db 61 AVTE GPIPEVLKNYMDAQ YYGEI GI GT PPQCFTWFDTGS SNLWVPS I 108 

Qy 105 PHPFL HRYYQRQLS ST YRDLRKGVYVP YTQGKWEGELGTDLVS IP 149 

I II I I I I : I I I I I I I : I 

Db 109 HCKLLDIACWIHHKYNSDKSSTYVKNGTSFDIHYGSGSLSGYLSQDTVSVPCQSASSASA 168 

Qy 150 HGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHV-P 208 

I I I : : : : | | | | : | | | : : : : | | | : | : : | I 

Db 169 LGGVKVERQVFGEATKQPGITFIAAKFDGILGMAYPRIS — VNNVL PVFDN LMQQKLVDQ 22 6 

Qy 209 NLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVE 268 

I : I I : I : I II : : : I I I I I I I I : I : I : : I : : I I 

Db 227 NIFSFYL SRDPDAQPGGELMLGGTDSKYYKGSLSYLNVTRKAYWQVHLDQVE 278 

Qy 269 I-NGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQ 327 

: : I I III : : I I I : I I : : I hi I : II 


Db 279 VASGLTL CKE -- GCEAI VDT GT S LMVG P VDE VRE LQKAI GAVP L I Q GEY 325


Qy 328 LV-CWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAIS 386 

: : I : I I I : I I I : : : : : | : | | : I : 

Db 326 MIPCEKVST LPAITLKLGG KGYKLS — PEDYTLKVSQAGKTL — CLSGFMG 372 

Qy 387 Q SSTGTVMGAVIMEGFYWFDRARKRIGFAVSA 419 

I : : I I : : I I I I I I : I I I : I 

Db 373 MDIPPPSGPLWILGDVFIGRYYTVFDRDNNRVGFAEAA 410 


RESULT 13 
PEP2_RABIT


ID PEP2_RABIT STANDARD; PRT; 387 AA. 

AC P27821; 

DT 01-AUG-1992 (Rel. 23, Created) 

DT 01-AUG-1992 (Rel. 23, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Pepsin II-2/3 precursor (EC 3.4.23.1) (Pepsin A). 

OS Oryctolagus cuniculus (Rabbit).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus.

OX NCBI_TaxID=9986;

RN [1]

RP SEQUENCE FROM N.A.

RX MEDLINE=91009127; PubMed=2129536;

RA Kageyama T., Tanabe K., Koiwai O.; 

RT "Structure and development of rabbit pepsinogens. Stage-specific 

RT zymogens, nucleotide sequences of cDNAs, molecular evolution, and 

RT gene expression during development."; 

RL J. Biol. Chem. 265:17031-17038(1990). 

CC -!- FUNCTION: SHOWS PARTICULARLY BROAD SPECIFICITY; ALTHOUGH BONDS 

CC INVOLVING PHENYLALANINE AND LEUCINE ARE PREFERRED, MANY OTHERS ARE 

CC ALSO CLEAVED TO SOME EXTENT. 

CC -!- CATALYTIC ACTIVITY: Preferential cleavage: hydrophobic, preferably

CC aromatic, residues in P1 and P1' positions. Cleaves 1-Phe-|-Val-2,

CC 4-Gln-|-His-5, 13-Glu-|-Ala-14, 14-Ala-|-Leu-15, 15-Leu-|-Tyr-16,

CC 16-Tyr-|-Leu-17, 23-Gly-|-Phe-24, 24-Phe-|-Phe-25 and 25-Phe-|-

CC Tyr-26 bonds in the B chain of insulin.
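
The bond notation in the CATALYTIC ACTIVITY comment (position, P1 residue, scissile bond, P1' residue, next position) can be checked against an insulin B chain sequence. The sketch below is illustrative only: the 30-residue human insulin B chain is an assumption, since the entry does not say which species' insulin is meant, and the helper names are hypothetical.

```python
# Assumption: human insulin B chain (30 residues); not given in the listing.
INSULIN_B = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"

# Three-letter codes for the residues that occur at the listed bonds.
THREE_LETTER = {
    "F": "Phe", "V": "Val", "Q": "Gln", "H": "His", "E": "Glu",
    "A": "Ala", "L": "Leu", "Y": "Tyr", "G": "Gly",
}

# 1-based P1 positions of the bonds named in the comment.
P1_POSITIONS = [1, 4, 13, 14, 15, 16, 23, 24, 25]


def bond(position, chain=INSULIN_B):
    """Format the bond between residue `position` (P1) and `position + 1` (P1')."""
    p1 = THREE_LETTER[chain[position - 1]]
    p1_prime = THREE_LETTER[chain[position]]
    return f"{position}-{p1}-|-{p1_prime}-{position + 1}"


def listed_bonds():
    """Return all bonds from the comment in the entry's notation."""
    return [bond(p) for p in P1_POSITIONS]


if __name__ == "__main__":
    print(", ".join(listed_bonds()))
```

Under that assumption, each generated bond matches the residue pair named in the comment, for example 1-Phe-|-Val-2 and 25-Phe-|-Tyr-26.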

CC -!- DEVELOPMENTAL STAGE: PEPSINOGENS IN GROUP I, II, AND III WHERE 

CC THE PREDOMINANT ZYMOGENS AT LATE POSTNATAL STAGE. 

CC -!- MISCELLANEOUS: THE EXPRESSION OF PEPSINOGEN GENES IS REGULATED BY 

CC HORMONES AND RELATED SUBSTANCES. 

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; M59235; AAA85369.1; -. 

DR HSSP; P00790; 1PSN. 

DR MEROPS; A01.001; 

DR InterPro; IPR001969; Aspprotease_site.

DR InterPro; IPR001461; AspproteaseA1.

DR Pfam; PF00026; asp; 1.

DR PRINTS; PR00792; PEPSIN.

DR PROSITE; PS00141; ASP_PROTEASE; 2.

KW Hydrolase; Aspartyl protease; Digestion; Zymogen; Signal;

KW Phosphorylation; Multigene family.

FT SIGNAL      1    15
FT PROPEP     16    59  ACTIVATION PEPTIDE.
FT CHAIN      60   387  PEPSIN II-2/3.
FT MOD_RES   129   129  PHOSPHORYLATION (POTENTIAL).
FT ACT_SITE   93    93  BY SIMILARITY.
FT ACT_SITE  276   276  BY SIMILARITY.
FT DISULFID  106   111  BY SIMILARITY.
FT DISULFID  267   271  BY SIMILARITY.
FT DISULFID  310   343  BY SIMILARITY.
SQ SEQUENCE 387 AA; 42100 MW; 66FC331A3DC75891 CRC64;


Query Match 11.2%; Score 299; DB 1; Length 387; 

Best Local Similarity 26.9%; Pred. No. 1.5e-16; 

Matches 97; Conservative 64; Mismatches 134; Indels 66; Gaps 13;


Qy 75 Y YVEMT VG S P P QT LN I LVDT G S S N FAVGAAP H P F — LHRYYQRQLSSTYRDLRKG 127

I : : : : I : I I I :: I I I I I I I I : I I : : : I I I I : :

Db 75 YFGTISIGTPPQDFTVIFDTGSSNLWV PSTYCSSLACALHKRFNPEDSSTYQGTSET 131

Qy 128 VYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEI 187

: : I I Mill: : : : II : : : I I I I I I I I

Db 132 LSITYGTGSMTGILGYDTVKVGSIEDTNQIFGLSKTEPSLTFLF — APFDGILGLAYPSI 189

Qy 188 ARPDDSLEPFFDSLVKQTHV-PNLFSLHLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYT 246

: I : III:: : I : I I I : : I I : : I I I I I I I

Db 190 SSSDAT — PVFDNMWNEGLVSQDLFSVYLSSDD EKGS LVMFGGI DS S YYT 237

Qy 247 GSLWYTPIRREWYYEVIIVRVEINGQDLKM — DCKEYNYDKSIVDSGTTNLRLPKKVFEA 304

I I I : I : II:::: I I I I : : I : : I I I : I I : I I

Db 238 G S LNWVP VS YE G YWQ I TMD S VS I N G ET I ACAD S C QAIVDTGTSLLTGP TS 287

Qy 305 AVKSIKAASSTEKFPDGFWLGEQLV-CWQAGTTPWNIFPVISLYLMGEVTNQSFRITILP 363

I : : I : : I I I I : : I : I : I II

Db 288 AISNIQSYIGASK NLLGENVISCSAIDSLPDIVF TING 325

Qy 364 QQYLRPVEDVATSQDDCYKFAISQSSTGT VMGAVIMEGFYWFDRARKRIGFAV 417

III : I I : : I : : I I : : : II I I I : : I I

Db 326 IQYPLPASAYILKEDDDCTSGLEGMNVDTYTGELWILGDVFIRQYFTVFDRANNQLGLAA 385

Qy 418 S 418

Db 386 A 386


RESULT 14 
PEP4_RABIT 

ID PEP4_RABIT STANDARD; PRT; 387 AA. 

AC P28713; 

DT 01-DEC-1992 (Rel. 24, Created) 

DT 01-DEC-1992 (Rel. 24, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 


DE Pepsin II-4 precursor (EC 3.4.23.1) (Pepsin A).
OS Oryctolagus cuniculus (Rabbit).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus.
OX NCBI_TaxID=9986;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=91009127; PubMed=2129536;
RA Kageyama T., Tanabe K., Koiwai O.;
RT "Structure and development of rabbit pepsinogens. Stage-specific
RT zymogens, nucleotide sequences of cDNAs, molecular evolution, and
RT gene expression during development.";
RL J. Biol. Chem. 265:17031-17038(1990).
CC -!- FUNCTION: SHOWS PARTICULARLY BROAD SPECIFICITY; ALTHOUGH BONDS
CC INVOLVING PHENYLALANINE AND LEUCINE ARE PREFERRED, MANY OTHERS ARE
CC ALSO CLEAVED TO SOME EXTENT.
CC -!- CATALYTIC ACTIVITY: Preferential cleavage: hydrophobic, preferably
CC aromatic, residues in P1 and P1' positions. Cleaves 1-Phe-|-Val-2,
CC 4-Gln-|-His-5, 13-Glu-|-Ala-14, 14-Ala-|-Leu-15, 15-Leu-|-Tyr-16,
CC 16-Tyr-|-Leu-17, 23-Gly-|-Phe-24, 24-Phe-|-Phe-25 and 25-Phe-|-
CC Tyr-26 bonds in the B chain of insulin.
CC -!- DEVELOPMENTAL STAGE: PEPSINOGENS IN GROUP I, II, AND III WHERE
CC THE PREDOMINANT ZYMOGENS AT LATE POSTNATAL STAGE.
CC -!- MISCELLANEOUS: THE EXPRESSION OF PEPSINOGEN GENES IS REGULATED BY
CC HORMONES AND RELATED SUBSTANCES.
CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.
DR PIR; D38302; D38302.
DR HSSP; P00790; 1PSN.
DR MEROPS; A01.001;
DR InterPro; IPR001969; Aspprotease_site.
DR InterPro; IPR001461; AspproteaseA1.
DR Pfam; PF00026; asp; 1.
DR PRINTS; PR00792; PEPSIN.
DR PROSITE; PS00141; ASP_PROTEASE; 2.
KW Hydrolase; Aspartyl protease; Digestion; Zymogen; Signal;
KW Phosphorylation; Multigene family.
FT SIGNAL      1    15
FT PROPEP     16    59  ACTIVATION PEPTIDE.
FT CHAIN      60   387  PEPSIN II-4.
FT MOD_RES   129   129  PHOSPHORYLATION (POTENTIAL).
FT ACT_SITE   93    93  BY SIMILARITY.
FT ACT_SITE  276   276  BY SIMILARITY.
FT DISULFID  106   111  BY SIMILARITY.
FT DISULFID  267   271  BY SIMILARITY.
FT DISULFID  310   343  BY SIMILARITY.
SQ SEQUENCE 387 AA; 42052 MW; 21ADD07782A89585 CRC64;

Query Match 11.2%; Score 298; DB 1; Length 387;

Best Local Similarity 26.1%; Pred. No. 1.8e-16;

Matches 97; Conservative 66; Mismatches 122; Indels 86; Gaps 14;


Qy 75 Y YVEMT VG S P P QT LN I LVDT G S S N FAVGAAPH P F LHRYYQRQLS ST YRDLRKG 127

I : : : : I : I I I : : I I I I M i I : I I : : : I I I I : :

Db 75 YFGTI S I GT PPQDFTVI FDTGS SNLWV PSTYCSSLACALHKRFNPEDSSTYQGTSET 131

Qy 128 VYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFF--INGSNWE 177

: : I I I I I I I : : | : : : : |

Db 132 LSITYGTGSMTGILGYDTV KVGSIEDTNQIFGLSKTEPGLTFLFAPFD 179


Qy 178 GILGLAYAEIARPDDSLEPFFDSLVKQTHV- PNLFSLHLCGAGFPLNQSEVLASVGGSMI 236 

I I I I I I I I : I : III:: : I : I I I : : I I 

Db 180 GI LGLAYPS I S S SDAT — PVFDNMWNEGLVSQDLFSVYLS SDD EKGSLVM 227 

Qy 237 IGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKM — DCKEYNYDKSIVDSGTTN 294 

I I I I I I I I I I : I : II:::: I I I I : : I : : | | | : I I : 

Db 228 FGGI DS S YYTGSLNWVPVS YEGYWQITMDSVS INGETI ACADSC QAIVDTGTSL 281

Qy 295 LRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLV-CWQAGTTPWNIFPVISLYLMGEVT 353 

II : I : : I : : I I I I : : I : I : I 

Db 282 LTGP TSAISNIQSYIGASK NLLGENVTSCSAIDSLPDIVF 321 

Qy 354 NQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGT VMGAVIMEGFYWFD 407

II III : I I : : I : : | | : : : I I I 

Db 322 TINGIQYPLPASAYILKEDDDCTSGLEGMNVDTYTGELWILGDVFIRQYFTVFD 375 

Qy 408 RARKRIGFAVS 418

II : : I I : 
Db 376 RANNQLGLAAA 386 


RESULT 15 
CATD_RAT

ID CATD_RAT STANDARD; PRT; 407 AA.

AC P24268; 

DT 01-MAR-1992 (Rel. 21, Created) 

DT 01-MAR-1992 (Rel. 21, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Cathepsin D precursor (EC 3.4.23.5). 

GN CTSD. 

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Sprague-Dawley; TISSUE=Pituitary; 

RX MEDLINE=91057150; PubMed=2243802;

RA Birch N.P., Loh Y.P.; 

RT "Cloning, sequence and expression of rat cathepsin D."; 

RL Nucleic Acids Res. 18:6445-6445(1990). 

RN [2] 

RP SEQUENCE FROM N.A., AND SEQUENCE OF 65-74; 118-127 AND 165-174.

RC TISSUE=Liver;

RX MEDLINE=91354249; PubMed=1883350;

RA Fujita H., Tanaka Y., Noguchi Y., Kono A., Himeno M., Kato K.;

RT "Isolation and sequencing of a cDNA clone encoding rat liver

RT lysosomal cathepsin D and the structure of three forms of mature

RT enzymes.";

RL Biochem. Biophys. Res. Commun. 179:190-196(1991).

RN [3]

RP SEQUENCE OF 134-170.

RX MEDLINE=89034127; PubMed=3182800;

RA Yonezawa S., Takahashi T., Wang X., Wong R.N.S., Hartsuck J.A.,

RA Tang J.;


RT "Structures at the proteolytic processing region of cathepsin D."; 

RL J. Biol. Chem. 263:16504-16511(1988). 

CC -!- FUNCTION: Acid protease active in intracellular protein breakdown. 

CC -!- CATALYTIC ACTIVITY: Specificity similar to, but narrower than, 
CC that of pepsin A. Does not cleave the 4-Gln-|-His-5 bond in B

CC chain of insulin. 

CC -!- SUBUNIT: OCCURS AS A MIXTURE OF BOTH A SINGLE CHAIN FORM AND TWO 

CC TYPES OF TWO CHAIN (LIGHT AND HEAVY) FORMS. 

CC -!- SUBCELLULAR LOCATION: Lysosomal. 

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X54467; CAA38349.1; -. 

DR PIR; S13111; KHRTD. 

DR HSSP; P07339; 1LYB. 

DR MEROPS; A01.009; -. 

DR InterPro; IPR001969; Aspprotease_site.

DR InterPro; IPR001461; AspproteaseA1.

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

KW Hydrolase; Aspartyl protease; Glycoprotein; Zymogen; Signal; 

KW Lysosome. 


FT SIGNAL      1    20  POTENTIAL.
FT PROPEP     21    64  ACTIVATION PEPTIDE (POTENTIAL).
FT CHAIN      65   407  CATHEPSIN D.
FT CHAIN      65   164  CATHEPSIN D 12 kDa LIGHT CHAIN.
FT CHAIN     165   407  CATHEPSIN D 30 kDa HEAVY CHAIN.
FT CHAIN      65   117  CATHEPSIN D 9 kDa LIGHT CHAIN.
FT CHAIN     118   407  CATHEPSIN D 34 kDa HEAVY CHAIN.
FT ACT_SITE   97    97  BY SIMILARITY.
FT ACT_SITE  290   290  BY SIMILARITY.
FT DISULFID   91   160  BY SIMILARITY.
FT DISULFID  110   117  BY SIMILARITY.
FT DISULFID  281   285  BY SIMILARITY.
FT DISULFID  324   361  BY SIMILARITY.
FT CARBOHYD  134   134  N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD  258   258  N-LINKED (GLCNAC...) (POTENTIAL)
FT CONFLICT   15    15  D -> A (IN REF. 2)
FT CONFLICT  163   163  D -> T (IN REF. 3)
FT CONFLICT  205   205  K -> N (IN REF. 2)
FT CONFLICT  262   262  K -> N (IN REF. 2)
SQ SEQUENCE 407 AA; 44680 MW; C423AD4104D95F84 CRC64;

Query Match 11.1%; Score 297; DB 1; Length 407;


Best Local Similarity 26.1%; Pred. No. 2.3e-16; 

Matches 118; Conservative 76; Mismatches 170; Indels 88; Gaps 20; 

Qy 6 PWLLLWMGAGVLPAHGTQHGIRLPLR SGLGGA--PLGLRLPRETDEEPEEP 54 

I : I I : I : I I : I I : I I I : : I I : I I : I I 


Db 4 PGVLLLI-LGLLDASSSAL-IRIPLRKFTSIRRTMTEVGGSVEDLILKGPITKYSMQSSP 61 

Qy 55 GRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFL 109 

• I:: I Hl= =1:111 == = I 

Db 62 RTKEPVSELLKNYLDAQ YYGEIGIGTPPQCFTWFDTGSSMLWVPSIHCKLLDIACW 118 

Qy 110 -HRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDK 168

II I I I I = I I llllhl :::: I : 

Db 119 VHHKYNSDKSSTYVKNGTSFDIHYGSGSLSGYLSQDTVSVP CKSDLGGIKVEKQ 172 

Qy 169 FF INGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHV-PNLFSLHLCG 217

I : : : I I I I : I I : : : I I I : | : | | | I : I I : | 

Db 173 IFGEATKQPGWFIAAKFDGILGMGYPFIS— VNKVLPVFDNLMKQKLVEKNIFSFYL-- 228 

Qy 218 AGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMD 277

= I I : : : I I I I I I I : I : I : : I : : : | : | : | : 

Db 229 NRDPTGQPGGELMLGGTDSRYYHGELSYLNVTRKAYWQVHMDQLEV-GSELTL- 280

Qy 278 CKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLV-CWQAGTT 336

II : : I I I : | | : I I : I : I I : | | : : | : : 

Db 281 CK — GGCEAIVDTGTSLLVGPVDEVKELQKAIGAVPLIQ GEYMIPCEKVSS- 329 

Qy 337 PWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQ 387

1:1: I I: : |::|: - I : :| 
Db 330 LPIITFKLGGQ NYELHPEKYILKVSQAGKT ICLSGFMGMDI PPP 373 

Qy 388 SSTGTVMGAVIMEGFYWFDRARKRIGFAVSA 419

I : : I I : : I I I I I I : I I I : I 

Db 374 SGPLWILGDVFIGCYYTVFDREYNRVGFAKAA 405 

Search completed: January 21, 2004, 09:23:03 
Job time : 25.9063 secs


